FISHER'S INFORMATION FOR DISCRETELY SAMPLED LEVY PROCESSES 



By Yacine Aït-Sahalia and Jean Jacod
Princeton University and Université de Paris-6

This paper studies the asymptotic behavior of the Fisher information for a Levy process 
discretely sampled at an increasing frequency. We show that it is possible to distinguish not only 
the continuous part of the process from its jumps part, but also different types of jumps, and 
derive the rates of convergence of efficient estimators. 



1. Introduction. Models allowing for sample path discontinuities are of considerable interest 
in mathematical finance, for instance in option pricing [see e.g., Eberlein and Jacod (1997), Chan
(1999), Boyarchenko and Levendorskii (2002), Mordecki (2002) and Carr and Wu (2004)], testing
for the presence of jumps in asset prices [see Aït-Sahalia (2002) and Carr and Wu (2003)], interest
rate modelling [see e.g., Eberlein and Raible (1999)], risk management [see e.g., Eberlein et al.
(1998) and Khindanova et al. (2001)], optimal portfolio choice [see e.g., Kallsen (2000), Rachev and
Han (2000) and Emmer and Klüppelberg (2004)], stochastic volatility modelling [see e.g., Barndorff-Nielsen
(1997), Barndorff-Nielsen (1998), Leblanc and Yor (1998), Carr et al. (2003) and Klüppelberg
et al. (2004)] or for the purpose of better describing asset returns data [see e.g., Mandelbrot (1963), 
Fama and Roll (1965), Mittnik and Rachev (2001), Carr et al. (2002)]. 

While these theoretical models are commonly used in mathematical finance, relatively little is 
known about the corresponding inference problem, which is a difficult one. A strand of the literature
focuses on the tail properties of stable processes to estimate the stable index [see e.g., Fama and
Roll (1968), Fama and Roll (1971), de Haan and Resnick (1980), Dumouchel (1983), McCulloch
(1997)]. Since Levy processes have known characteristic functions, given by the Levy-Khintchine
formula, a method often proposed is based on the empirical characteristic function as an estimating
equation [see e.g., Press (1972), Fenech (1976), Feuerverger and McDunnough (1981b), Chapter 4 in
Zolotarev (1986) and Singleton (2001)], maximum likelihood by Fourier inversion of the characteristic
function [see Feuerverger and McDunnough (1981a)], or a regression based on the explicit form of the
characteristic function [see Koutrouvelis (1980)]. Some of these methods were compared in Akgiray
and Lamoureux (1989).

Fairly little is known in most cases as to the optimality of statistical procedures in the presence
of jumps. So we consider in this paper the behavior of the Fisher information when the observations
are generated by a Levy process X whose law depends on a parameter vector $\eta$ to be estimated. In
light of the Cramer-Rao bound, our objective is to establish the optimality of potential estimators
of $\eta$, and the rate at which they will converge. While we focus on its implications for the classical
likelihood inference problem, Fisher's information also plays as usual an important role in Bayesian 
inference or in determining the form of asymptotically most powerful tests. 

The essential difficulty in this class of problems is the fact that the density of most discretely sam- 
pled Levy processes, hence the corresponding likelihood function and Fisher's information, are not 
known in closed form. Representations in terms of special functions are available [see Zolotarev (1995)
in terms of Meijer G-functions and Hoffmann-Jørgensen (1993) in terms of incomplete hypergeometric
functions] although they do not appear to lead to practical formulae. One must therefore
rely on numerical methods as the maximum likelihood estimator cannot be computed exactly [see
Dumouchel (1971) for a multinomial approximation to the likelihood function, and Nolan (1997) and
Nolan (2001)]. Therefore, there is potential value in considering alternative estimators which can
both be computed explicitly and be rate-efficient. Indeed, in a companion paper [Aït-Sahalia and
Jacod (2004)], we propose estimators designed to achieve the efficient rate that we identify in this
paper based on the convergence properties of the Fisher information. 

Let us be more specific. The Levy process X is observed at n times $\Delta, 2\Delta, \ldots, n\Delta$. Recalling that
$X_0 = 0$, this amounts to observing the n increments $X_{i\Delta} - X_{(i-1)\Delta}$. So when $\Delta > 0$ is fixed, we
observe n i.i.d. variables distributed as $X_\Delta - X_0$ and having a density which depends smoothly
on the parameter $\eta$, and we are on known grounds: the Fisher information at stage n has the
form $I_{n,\Delta}(\eta) = n I_\Delta(\eta)$, where $I_\Delta(\eta) > 0$ is the Fisher information (an invertible matrix if $\eta$ is
multi-dimensional) of the model based upon the observation of the single variable $X_\Delta - X_0$; we
have the LAN property with rate $\sqrt{n}$; the asymptotically efficient estimators $\hat\eta_n$ are those for which
$\sqrt{n}(\hat\eta_n - \eta)$ converges in law to the normal distribution $N(0, I_\Delta(\eta)^{-1})$, and the MLE does the job
[see e.g., Dumouchel (1973a)].

Things become more complicated when the time interval separating successive observations, $\Delta$,
varies, and more specifically in the limit when it becomes small. This type of asymptotics corresponds
to a situation which is increasingly common in financial applications, where high frequency data are
available with sampling intervals measured in seconds for many stocks or currencies. At stage n we
have n observations, recorded at times $i\Delta_n$ for some time lag $\Delta_n$ going to 0. Equivalently, we observe
the n increments $\chi_i^n = X_{i\Delta_n} - X_{(i-1)\Delta_n}$, which are i.i.d. for any given n. These increments are known
as the log-returns when X is the log of the price of a financial asset. However the law of $\chi_i^n$ depends
on n, and indeed weakly converges to the Dirac mass at 0. The Fisher information at stage n still
has the form $I_{n,\Delta_n}(\eta) = n I_{\Delta_n}(\eta)$, but the behavior of the information $I_{\Delta_n}(\eta)$ is far from obvious.

In order to say more about the behavior of Fisher's information, we need of course to parametrize 
the model. For the same reason that the computation of the MLE is hindered by the absence of an 
explicit density, the analysis of the Fisher information matrix is difficult. Dumouchel (1973b) and 
Dumouchel (1975) computed the information by numerical approximation of the densities and their
derivatives. Such direct computation is numerically cumbersome because the series expansion for the 
density converges slowly, especially when the order of the stable process is near one. Brockwell and 
Brown (1980) propose an alternative numerical computation of the information based on a Fourier 
series for the derivatives of the density. 

In this paper, we are able to explicitly describe, in closed form, the limiting behavior of the 
Fisher information when we restrict attention to a special kind of Levy process that is relevant to 
applications in financial statistics. While our form of the process is undoubtedly restrictive, it is 
nevertheless sufficiently rich to exhibit a surprising range of different asymptotic behaviors for the 
Fisher information. In fact, we will show that different rates of convergence are achieved for different 
parameters, and for different types of Levy processes. Rates depart from the standard $\sqrt{n}$ in a number
of different and often unexpected ways. 

Specifically, we split X into the sum of two independent Levy processes, with possibly one or two 
scale parameters. That is, we suppose that 

(1) $X_t = \sigma W_t + \theta Y_t$.

Here, we have $\sigma > 0$ and $\theta \in \mathbb{R}$, and W is a standard symmetric stable process with index $\beta \in (0, 2]$,
and we are often interested in the situation where $\beta = 2$ and so W is a Wiener process (hence
the notation used). As for Y, it is another Levy process, viewed as a perturbation of W. In some
applications, Y may represent frictions that are due to the mechanics of the trading process, or in
the case of compound Poisson jumps it may represent the infrequent arrival of relevant information
related to the asset. In the latter case, W is then the driving process for the ordinary fluctuations
of the asset value. Y is independent of W, and its law is either known or is a nuisance parameter,
and is dominated by W in a sense stated below. For example, when W is a Brownian motion, this
just means that Y has no Brownian part; when $\beta < 2$, then Y could for example be another stable
process with index $\alpha < \beta$, or a compound Poisson process. The parameter vector we then consider
is $\eta = (\sigma, \beta, \theta)$.

If Y is viewed as a perturbation of W, then our interest in studying the Fisher information lies
in deciding whether we can estimate the parameter $\sigma$, and also in some cases the index $\beta$ (the only
two parameters on which the law of the process $\sigma W$ depends) with the same degree of accuracy as
when the process Y is absent, at least asymptotically. The answer to this question is "yes". When
W is a Brownian motion this means that one can distinguish between the jumps due to Y and the
continuous part of X, and this fact was already known in the specific example of a Brownian motion
coupled with either a Poisson or Cauchy process [see Aït-Sahalia (2004)]. It comes more as a surprise
when $\beta < 2$: we can then discriminate between the jumps due to W and those due to Y, despite the
fact that both processes jump and we only have discrete observations.

The paper is organized as follows. In Section 2 we set up the problem and define in particular
the class of processes Y that are dominated by W. In Section 3 we study the baseline case where
$X_t = \sigma W_t$ and establish the properties of the Fisher information in the absence of the perturbation
process Y. In Section 4 we characterize the set of processes Y whose presence does not affect the
estimation of the base parameters $(\sigma, \beta)$. Then we study in Section 5 the estimation problem for the
dominated scale parameter $\theta$. In this case, the results vary substantially according to the structure of
the process Y, and we illustrate the versatility of the situation by displaying the variety of convergence
rates that arise.

Finally, we also briefly consider in Section 6 a slightly different model, where

(2) $X_t = \sigma(W_t + Y_t)$.

Here it is natural to consider the law of Y and the index $\beta$ as known, and $\sigma$ to be the only parameter
to be estimated. The results are again a bit unexpected, namely one can do as well as when Y is
absent, and in some instances (when the law of Y is sufficiently singular) the presence of Y can in
fact help us improve the estimation of $\sigma$.

All proofs are in Section 7.

2. Setup. The characteristic function of $W_t$ is

(3) $E(e^{iuW_t}) = e^{-t|u|^\beta/2}$.

The factor 2 above is unusual for stable processes when $\beta < 2$, but we put it here to ensure continuity
between the stable and the Gaussian cases. As is well known, when $\beta < 2$ we have $E(|W_t|^p) < \infty$ if
and only if $0 < p < \beta$, and the tails of $W_1$ behave according to $P(W_1 > w) \sim c_\beta/(\beta w^\beta)$ as $w \to \infty$
(and symmetrically as $w \to -\infty$), where the constants $c_\beta$ are given by

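(4) $c_\beta = \begin{cases} \dfrac{\beta(1-\beta)}{4\,\Gamma(2-\beta)\cos(\beta\pi/2)} & \text{if } \beta \neq 1,\ \beta < 2, \\[2mm] \dfrac{1}{2\pi} & \text{if } \beta = 1. \end{cases}$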

This follows from the series expansion of the density due to Bergstrøm (1952), the duality property of
the stable densities of order $\beta$ and $1/\beta$ [see e.g., Chapter 2 in Zolotarev (1986)], with an adjustment
factor to reflect our definition of the characteristic function in (3).

The law of Y (as a process) is entirely specified by the law $G_\Delta$ of the variable $Y_\Delta$ for any given
$\Delta > 0$. We write $G = G_1$, and we recall that the characteristic function of $G_\Delta$ is given by the
Levy-Khintchine formula

(5) $E(e^{ivY_\Delta}) = \exp \Delta \left( ivb - \dfrac{cv^2}{2} + \int F(dx) \left( e^{ivx} - 1 - ivx\, 1_{\{|x| \le 1\}} \right) \right)$,

where $(b, c, F)$ is the "characteristic triple" of G (or, of Y): $b \in \mathbb{R}$ is the drift of Y, and $c \ge 0$ the
local variance of the continuous part of Y, and F is the Levy jump measure of Y, which satisfies
$\int (1 \wedge x^2) F(dx) < \infty$ [see e.g., Chapter II.2 in Jacod and Shiryaev (2003)].

The fact that Y is "dominated" by W is expressed by the property that G belongs to the class
$\mathcal{G}_\beta$ which we define as follows. Let first $\Phi$ be the class of all increasing and bounded functions
$\phi : (0, 1] \to \mathbb{R}_+$ having $\lim_{x \downarrow 0} \phi(x) = 0$. Then we set

(6) $\mathcal{G}(\phi, \alpha)$ = the set of all infinitely divisible distributions with $c = 0$ and, for all $x \in (0, 1]$,

    $x^\alpha F([-x, x]^c) \le \phi(x)$ if $\alpha < 2$,

    $x^2 F([-x, x]^c) \le \phi(x)$ and $\int_{\{|y| \le x\}} |y|^2 F(dy) \le \phi(x)$ if $\alpha = 2$,

(7) $\mathcal{G}_\alpha = \bigcup_{\phi \in \Phi} \mathcal{G}(\phi, \alpha)$.

We have $\lim_{x \downarrow 0} x^\alpha F([-x, x]^c) = 0$ if and only if the function $\phi(y) = \sup_{x \in (0, y]} x^\alpha F([-x, x]^c)$
belongs to $\Phi$, whereas $\int_{\{|y| \le x\}} |y|^2 F(dy)$ always decreases to 0 as $x \downarrow 0$, so we also have another,
simpler, description of $\mathcal{G}_\alpha$ for all $\alpha \in (0, 2]$:

(8) $\mathcal{G}_\alpha = \left\{ G \text{ is infinitely divisible},\ c = 0,\ \lim_{x \downarrow 0} x^\alpha F([-x, x]^c) = 0 \right\}$.

We also have for any $0 < x < y \le 1$:

$x^2 F([-x, x]^c) \le x^2 F([-y, y]^c) + \int_{\{|z| \le y\}} z^2 F(dz)$,

from which we deduce that $x^2 F([-x, x]^c) \to 0$ as $x \downarrow 0$ for any infinitely divisible G. Therefore, $\mathcal{G}_2$ is
indeed the set of all infinitely divisible laws G such that $c = 0$. Obviously $\alpha < \alpha'$ implies $\mathcal{G}_\alpha \subset \mathcal{G}_{\alpha'}$.
If G is a (not necessarily symmetric) stable law with index $\gamma$ it belongs to $\mathcal{G}_\alpha$ for all $\alpha > \gamma$, but not
to $\mathcal{G}_\gamma$. If Y is a compound Poisson process plus a drift, then G is in $\bigcap_{\alpha > 0} \mathcal{G}_\alpha$.
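
As a worked check of the stable-law claim, here is a short computation. It assumes the standard form of the Levy measure of a stable law with index $\gamma < 2$; the nonnegative constants $c_+$ and $c_-$ (not both zero) are our own illustrative notation and do not appear in the paper.

```latex
F(dx) = \left( c_+ 1_{\{x>0\}} + c_- 1_{\{x<0\}} \right) |x|^{-1-\gamma}\,dx
\quad\Longrightarrow\quad
F([-x,x]^c) = \frac{c_+ + c_-}{\gamma}\, x^{-\gamma},
\qquad
x^{\alpha} F([-x,x]^c) = \frac{c_+ + c_-}{\gamma}\, x^{\alpha-\gamma}.
```

The last expression tends to 0 as $x \downarrow 0$ exactly when $\alpha > \gamma$, which matches the membership criterion in (8).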

The variables under consideration have densities which depend smoothly on the parameters, so
Fisher's information is an appropriate tool for studying the optimality of estimators. In the basic
case of the model (1), the law of the observed process X depends on the three parameters $(\sigma, \beta, \theta)$
to be estimated, plus on the law of Y which is summarized by G. The law of the variable $X_\Delta$ has a
density which depends smoothly on $\sigma$ and $\theta$, so that the 2×2 Fisher information matrix (relative
to $\sigma$ and $\theta$) of our experiment exists; it also depends smoothly on $\beta$ when $\beta < 2$, so in this case the
3×3 Fisher information matrix exists. In all cases we denote it by $I_{n,\Delta_n}(\sigma, \beta, \theta, G)$, and it has the
form

$I_{n,\Delta_n}(\sigma, \beta, \theta, G) = n\, I_{\Delta_n}(\sigma, \beta, \theta, G)$,

where $I_\Delta(\sigma, \beta, \theta, G)$ is the Fisher information matrix associated with the observation of a single
variable $X_\Delta$. We denote the elements of the matrix $I_\Delta(\sigma, \beta, \theta, G)$ as $I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, G)$, $I_\Delta^{\sigma\beta}(\sigma, \beta, \theta, G)$,
etc. We may occasionally drop G, but at this stage it is mentioned because it may appear as a
nuisance parameter in our model and we wish to have estimates for the Fisher information that are
uniform in G, at least on some reasonable class of G's. Let us also mention that in many cases the
parameter $\beta$ is indeed known: this is particularly true when W is a Brownian motion.

For the other model (2) the Fisher information for estimating $\sigma$ (a positive number here) is now
denoted by $I'_{n,\Delta_n}(\sigma, \beta, G)$, and it has the form

$I'_{n,\Delta_n}(\sigma, \beta, G) = n\, I'_{\Delta_n}(\sigma, \beta, G)$,

with $I'_\Delta(\sigma, \beta, G)$ being the Fisher information associated with the observation of a single variable
$X_\Delta$.

3. The baseline case: estimating the parameters of the stable process $X = \sigma W$. In this
section we consider the base case $Y = 0$, that is we observe the stable process $X = \sigma W$ with scale
parameter $\sigma > 0$ and index parameter $\beta \in (0, 2]$. In our general framework, this corresponds to the
situation where $G = \delta_0$, a Dirac mass at 0, and we set the (now unidentified) parameter $\theta$ to 0, or
for that matter any arbitrary value.

We have only the two parameters $\sigma$ and $\beta$ here, and our objective in this section is to compute
the Fisher information matrix in this case:

$\begin{pmatrix} I_\Delta^{\sigma\sigma}(\sigma, \beta, 0, \delta_0) & I_\Delta^{\sigma\beta}(\sigma, \beta, 0, \delta_0) \\ I_\Delta^{\sigma\beta}(\sigma, \beta, 0, \delta_0) & I_\Delta^{\beta\beta}(\sigma, \beta, 0, \delta_0) \end{pmatrix}$.

In future sections, we will examine how the terms in $I_\Delta(\sigma, \beta, \theta, G)$ relate to those in $I_\Delta(\sigma, \beta, 0, \delta_0)$.



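3.1. The scale parameter $\sigma$. By the scaling property of symmetric stable processes, which says
that $W_\Delta$ and $\Delta^{1/\beta} W_1$ have the same law, it is intuitively clear that $I_\Delta^{\sigma\sigma}(\sigma, \beta, 0, \delta_0)$ does not depend
on $\Delta$. Indeed, let us denote by $h_\beta$ the density of $W_1$, which is defined through (3). The density of
$X_\Delta = \sigma W_\Delta$ is

$p_\Delta(x \,|\, \sigma, \beta, 0, \delta_0) = \dfrac{1}{\sigma \Delta^{1/\beta}}\, h_\beta\!\left( \dfrac{x}{\sigma \Delta^{1/\beta}} \right)$.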

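It is well known that $h_\beta$ is $C^\infty$ (by repeated integration of the characteristic function), even, and
that its n-th derivative $h_\beta^{(n)}$ behaves as follows (the first two derivatives are denoted $h'_\beta$ and $h''_\beta$):

(9) $h_\beta^{(n)}(w) \sim \begin{cases} (-1)^n\, c_\beta\, (1+\beta)(2+\beta) \cdots (n+\beta) \,/\, w^{n+1+\beta} & \text{if } \beta < 2, \\[1mm] (-1)^n\, w^n e^{-w^2/2} / \sqrt{2\pi} & \text{if } \beta = 2, \end{cases}$ as $w \to \infty$

(where $c_\beta$ is given in (4); this result follows from the same series expansion as above). Let us also
associate with $h_\beta$ the following functions: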

(10) $\breve{h}_\beta(w) = h_\beta(w) + w\, h'_\beta(w), \qquad \tilde{h}_\beta(w) = \dfrac{\breve{h}_\beta(w)^2}{h_\beta(w)}$.

Then $\tilde{h}_\beta$ is positive, even, continuous, and $\tilde{h}_\beta(w) = O(1/|w|^{1+\beta})$ as $|w| \to \infty$, hence $\tilde{h}_\beta$ is Lebesgue-integrable.

Consider now

(11) $\mathcal{I}(\beta) = \int \tilde{h}_\beta(w)\, dw$,

which is well defined and positive. Moreover if $\beta = 2$ (W is then Brownian motion), $h_2$ is Gaussian
and we have $h'_2(w) = -w\, h_2(w)$, so $\tilde{h}_2(w) = (1 - 2w^2 + w^4)\, h_2(w)$ and

(12) $\beta = 2 \ \Rightarrow\ \mathcal{I}(\beta) = 2$.

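The Fisher information for $\sigma$ associated with the observation of a single variable $X_\Delta = \sigma W_\Delta$ is

$I_\Delta^{\sigma\sigma}(\sigma, \beta, 0, \delta_0) = \int \dfrac{\left( \partial_\sigma p_\Delta(x \,|\, \sigma, \beta, 0, \delta_0) \right)^2}{p_\Delta(x \,|\, \sigma, \beta, 0, \delta_0)}\, dx = \dfrac{1}{\sigma^3 \Delta^{1/\beta}} \int \tilde{h}_\beta\!\left( \dfrac{x}{\sigma \Delta^{1/\beta}} \right) dx$,

which, in light of (11) and by a change of variable, reduces to:

(13) $I_\Delta^{\sigma\sigma}(\sigma, \beta, 0, \delta_0) = \dfrac{1}{\sigma^2}\, \mathcal{I}(\beta)$.

So, as said before, this does not depend on $\Delta$. In fact, $\mathcal{I}(\beta)$ is simply the Fisher information at point
$\sigma = 1$ for the statistical model in which we observe $\sigma W_1$ and $W_1$ is a variable with density $h_\beta$.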
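
The invariance in $\Delta$ can be checked numerically in the Gaussian case $\beta = 2$, where the density is explicit and (12)-(13) give the value $2/\sigma^2$. The following is a minimal sketch, not the paper's method: the function names, the integration window of ten standard deviations, and the central finite difference in $\sigma$ are all illustrative assumptions.

```python
import math

def density(x, sigma, delta):
    """Density of X_Delta = sigma * W_Delta for beta = 2, i.e. N(0, sigma^2 * delta)."""
    s = sigma * math.sqrt(delta)
    return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def sigma_information(sigma, delta, steps=20000, eps=1e-5):
    """Trapezoidal approximation of int (d/dsigma p)^2 / p dx over [-10 s, 10 s]."""
    s = sigma * math.sqrt(delta)
    lo, hi = -10.0 * s, 10.0 * s
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * dx
        p = density(x, sigma, delta)
        # Central finite difference in sigma (a numerical shortcut, assumed here).
        dp = (density(x, sigma + eps, delta) - density(x, sigma - eps, delta)) / (2.0 * eps)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * dp * dp / p * dx
    return total

if __name__ == "__main__":
    sigma = 1.5
    for delta in (1.0, 1e-2, 1e-4):
        # The two printed numbers should agree closely for every delta.
        print(delta, sigma_information(sigma, delta), 2.0 / sigma ** 2)
```

For $\beta < 2$ the same check would require a numerical stable density, which is outside the scope of this sketch.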



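3.2. The index parameter $\beta$. Consider now the estimation of $\beta$. This problem was studied by
Dumouchel (1973a), who computed numerically the term $I_\Delta^{\beta\beta}(\sigma, \beta, 0, \delta_0)$, including also an asymmetry
parameter. It is easily seen that $\beta \mapsto h_\beta(w)$ is differentiable on (0, 2], and we denote by $\dot{h}_\beta(w)$ its
derivative. However, instead of (9) one has

(14) $\dot{h}_\beta(w) \sim \begin{cases} -\,c_\beta \log|w| \,/\, |w|^{1+\beta} & \text{if } \beta < 2, \\[1mm] -\,1 / (2|w|^3) & \text{if } \beta = 2, \end{cases}$

as $|w| \to \infty$, by differentiation of the series expansion for the stable density. Therefore the quantity

(15) $\mathcal{K}(\beta) = \int \dfrac{\dot{h}_\beta(w)^2}{h_\beta(w)}\, dw$

is finite when $\beta < 2$ and infinite for $\beta = 2$. This is the Fisher information for estimating $\beta$, upon
observing the single variable $W_1$.

Instead of computing the information quantities numerically based on approximations of the stable
density $h_\beta$, we study explicitly their asymptotic behavior as $\Delta \to 0$. Excluding the degenerate case
where $\beta = 2$ [see Dumouchel (1983) for the behavior of the MLE for $\beta$ when $\beta = 2$], the Fisher
information for $\beta$ associated with the observation of a single variable $\sigma W_\Delta$ when $\beta < 2$ is

$I_\Delta^{\beta\beta}(\sigma, \beta, 0, \delta_0) = \dfrac{1}{\beta^4} \int \dfrac{\left( \log(\Delta)\, \breve{h}_\beta(w) + \beta^2 \dot{h}_\beta(w) \right)^2}{h_\beta(w)}\, dw = \dfrac{(\log(\Delta))^2}{\beta^4}\, \mathcal{I}(\beta) + \dfrac{2 \log(\Delta)}{\beta^2} \int \dfrac{\breve{h}_\beta(w)\, \dot{h}_\beta(w)}{h_\beta(w)}\, dw + \mathcal{K}(\beta)$,

and the middle integral in the last display is smaller than $\sqrt{\mathcal{I}(\beta)\, \mathcal{K}(\beta)}$ by Cauchy-Schwarz. Therefore,
as $\Delta \to 0$, we have

(16) $\dfrac{I_\Delta^{\beta\beta}(\sigma, \beta, 0, \delta_0)}{(\log(1/\Delta))^2} \to \dfrac{1}{\beta^4}\, \mathcal{I}(\beta)$.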



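3.3. The cross $(\sigma, \beta)$ term. As for the cross-term, when of course $\beta < 2$ again, we have

$I_\Delta^{\sigma\beta}(\sigma, \beta, 0, \delta_0) = \dfrac{1}{\sigma \beta^2} \int \dfrac{\breve{h}_\beta(w) \left( \log(1/\Delta)\, \breve{h}_\beta(w) - \beta^2 \dot{h}_\beta(w) \right)}{h_\beta(w)}\, dw$.

Therefore, as $\Delta \to 0$, we have

(17) $\dfrac{I_\Delta^{\sigma\beta}(\sigma, \beta, 0, \delta_0)}{\log(1/\Delta)} \to \dfrac{1}{\sigma \beta^2}\, \mathcal{I}(\beta)$.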



3.4. The information for a translation model. We will see another information appear in some of
the forthcoming formulas, namely the Fisher information associated with the estimation of the real
number a for the model where one observes the single variable $W_1 + a$. This Fisher information is of
course the following number:

(18) $\mathcal{J}(\beta) = \int \dfrac{h'_\beta(w)^2}{h_\beta(w)}\, dw$.

Observe in particular that

(19) $\beta = 2 \ \Rightarrow\ \mathcal{J}(\beta) = 1$.

3.5. Some consequences for the estimation. If now we come back to our setting where n values (or
increments) of $X = \sigma W$ are observed along a time lag $\Delta_n$, we see that when $\beta$ is known we can hope
for estimators $\hat\sigma_n$ for $\sigma$ which are asymptotically efficient in the sense that $\sqrt{n}(\hat\sigma_n - \sigma)$ converges in
law to $N(0, \sigma^2/\mathcal{I}(\beta))$, whatever $\Delta_n$ behaves like as $n \to \infty$, and of course the MLE satisfies that.

When it comes to estimating $\beta$, things are different. When $\Delta_n \to 0$ and when the true value is
$\beta < 2$, we can hope for estimators converging to $\beta$ with the faster rate $\sqrt{n} \log(1/\Delta_n)$ (and we provide
such estimators in Aït-Sahalia and Jacod (2004)). Some of these estimators are constructed using
the jumps of the process of a size greater than some threshold, as in Höpfner and Jacod (1994). Note
also that if we suspect that $\beta = 2$ we would rather perform a test, as advised by Dumouchel (1973a),
and anyway in this case the behavior of the Fisher information does not provide much insight.
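
To make the two rates concrete, the sketch below turns the leading-order information for $\sigma$ (constant in $\Delta$) and for $\beta$ (growing like $(\log(1/\Delta))^2$) into approximate Cramer-Rao standard errors, each computed as if the other parameter were known. The function name is hypothetical, and the value of $\mathcal{I}(\beta)$ is passed in as `scale_info` because it has no closed form when $\beta < 2$; any number supplied for it is an assumption of the caller.

```python
import math

def leading_order_std_errors(n, delta, sigma, beta, scale_info):
    """Leading-order Cramer-Rao standard errors, each with the other parameter known.

    scale_info stands for I(beta); it is an input because it has no closed
    form when beta < 2.
    """
    if not (0.0 < beta < 2.0):
        raise ValueError("the beta-information limit requires 0 < beta < 2")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie in (0, 1) so that log(1/delta) > 0")
    root = math.sqrt(n * scale_info)
    se_sigma = sigma / root                               # information I(beta) / sigma^2
    se_beta = beta ** 2 / (root * math.log(1.0 / delta))  # information (log(1/delta))^2 I(beta) / beta^4
    return se_sigma, se_beta
```

With n fixed, shrinking `delta` leaves `se_sigma` unchanged and reduces `se_beta` in proportion to $1/\log(1/\Delta)$, which is the logarithmic gain described in the text.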

4. The general semiparametric case. The data generating process is now given by (1). We
are interested in estimating $(\sigma, \beta)$, and in some instances $\theta$ as well, leaving the distribution $G \in \mathcal{G}_\beta$
unspecified.

4.1. Estimation of $(\sigma, \beta)$. We start by studying whether the limiting behavior of $I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, G)$
when $\beta = 2$ and of the $(\sigma, \beta)$ block of the matrix $I_\Delta(\sigma, \beta, \theta, G)$ when $\beta < 2$ is affected by the presence
of Y. First, we have the intuitively obvious upper bound on Fisher's information in the presence of Y by the
one for which Y is absent. Note that in this result no assumption whatsoever is made on Y (except
of course that it is independent of W):

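Theorem 1. For any $\Delta > 0$ we have

(20) $I_\Delta^{\sigma\sigma}(\sigma, 2, \theta, G) \le I_\Delta^{\sigma\sigma}(\sigma, 2, 0, \delta_0)$

and, when $\beta < 2$, the difference

$\begin{pmatrix} I_\Delta^{\sigma\sigma}(\sigma, \beta, 0, \delta_0) & I_\Delta^{\sigma\beta}(\sigma, \beta, 0, \delta_0) \\ I_\Delta^{\sigma\beta}(\sigma, \beta, 0, \delta_0) & I_\Delta^{\beta\beta}(\sigma, \beta, 0, \delta_0) \end{pmatrix} - \begin{pmatrix} I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, G) & I_\Delta^{\sigma\beta}(\sigma, \beta, \theta, G) \\ I_\Delta^{\sigma\beta}(\sigma, \beta, \theta, G) & I_\Delta^{\beta\beta}(\sigma, \beta, \theta, G) \end{pmatrix}$

is a positive semi-definite matrix, and in particular we have:

(21) $I_\Delta^{\beta\beta}(\sigma, \beta, \theta, G) \le I_\Delta^{\beta\beta}(\sigma, \beta, 0, \delta_0)$.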

Next, how does the limit as $\Delta \to 0$ of $I_\Delta(\sigma, \beta, \theta, G)$ compare to that of $I_\Delta(\sigma, \beta, 0, \delta_0)$? For instance,
given that in the absence of Y we can estimate $\sigma$ with information $\mathcal{I}(\beta)/\sigma^2$, we would like to find
out what is the impact, if any, of the presence of Y on the information we can gather about that
parameter from the discrete observations where W is perturbed by Y:

$X_{i\Delta_n} - X_{(i-1)\Delta_n} = \sigma \left( W_{i\Delta_n} - W_{(i-1)\Delta_n} \right) + \theta \left( Y_{i\Delta_n} - Y_{(i-1)\Delta_n} \right)$ for $i = 1, \ldots, n$.

The answer to that question is given by the following.
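Theorem 2. a) If $G \in \mathcal{G}_\beta$ we have as $\Delta \to 0$:

(22) $I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, G) \to \dfrac{1}{\sigma^2}\, \mathcal{I}(\beta)$,

and also, when $\beta < 2$:

(23) $\dfrac{I_\Delta^{\beta\beta}(\sigma, \beta, \theta, G)}{(\log(1/\Delta))^2} \to \dfrac{1}{\beta^4}\, \mathcal{I}(\beta), \qquad \dfrac{I_\Delta^{\sigma\beta}(\sigma, \beta, \theta, G)}{\log(1/\Delta)} \to \dfrac{1}{\sigma \beta^2}\, \mathcal{I}(\beta)$.

b) For any $\phi \in \Phi$ and $\alpha \in (0, \beta]$ and $K > 0$, we have as $\Delta \to 0$:

(24) $\sup_{G \in \mathcal{G}(\phi, \alpha),\, |\theta| \le K} \left| I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, G) - \dfrac{\mathcal{I}(\beta)}{\sigma^2} \right| \to 0$,

and, when $\beta < 2$,

$\sup_{G \in \mathcal{G}(\phi, \alpha),\, |\theta| \le K} \left| \dfrac{I_\Delta^{\beta\beta}(\sigma, \beta, \theta, G)}{(\log(1/\Delta))^2} - \dfrac{\mathcal{I}(\beta)}{\beta^4} \right| \to 0, \qquad \sup_{G \in \mathcal{G}(\phi, \alpha),\, |\theta| \le K} \left| \dfrac{I_\Delta^{\sigma\beta}(\sigma, \beta, \theta, G)}{\log(1/\Delta)} - \dfrac{\mathcal{I}(\beta)}{\sigma \beta^2} \right| \to 0$.

c) For each n, let $G^n$ be the standard symmetric stable law of index $\alpha_n$, with $\alpha_n$ a sequence strictly
increasing to $\beta$. Then for any sequence $\Delta_n \to 0$ such that $(\beta - \alpha_n) \log \Delta_n \to 0$ (i.e. the rate at which
$\Delta_n \to 0$ is slow enough), the sequence of numbers $I_{\Delta_n}^{\sigma\sigma}(\sigma, \beta, \theta, G^n)$ (resp. $I_{\Delta_n}^{\beta\beta}(\sigma, \beta, \theta, G^n)/(\log(1/\Delta_n))^2$
when further $\beta < 2$) converges to a limit which is strictly less than $\mathcal{I}(\beta)/\sigma^2$ (resp. $\mathcal{I}(\beta)/\beta^4$).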

In other words, at their respective leading orders in $\Delta$, the presence of Y has no impact on the
information terms $I_\Delta^{\sigma\sigma}$, $I_\Delta^{\sigma\beta}$ and $I_\Delta^{\beta\beta}$, as soon as Y is "dominated" by W: so, in the limit where
$\Delta \to 0$, the parameters $\sigma$ and $\beta$ can be estimated with the exact same degree of precision whether
Y is present or not. Moreover, part (b) states that the convergence of Fisher's information is uniform on
the set $\mathcal{G}(\phi, \alpha)$ and $|\theta| \le K$ for all $\alpha \in (0, \beta]$; this settles the case where G and $\theta$ are considered as
nuisance parameters when we estimate $\sigma$ and $\beta$.

But as $\alpha$ tends to $\beta$, the convergence disappears, as stated in part (c). This shows that the class
$\mathcal{G}_\beta$ is effectively the largest one for which the presence of a Y process does not affect the estimation
of the parameters of the process $\sigma W$. For example, if $\beta = 2$, in part (c) take $G^n$ to be the symmetric
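stable law with index $\alpha_n \in (0, 2)$ and scale parameter s in the sense that its characteristic function
is $u \mapsto \exp\left( -\frac{s^{\alpha_n}}{2} |u|^{\alpha_n} \right)$. Then if $\alpha_n \to 2$, for all sequences $\Delta_n$ satisfying $(2 - \alpha_n) \log \Delta_n \to 0$ we
have

$I_{\Delta_n}^{\sigma\sigma}(\sigma, 2, \theta, G^n) \to \dfrac{2\sigma^2}{(\sigma^2 + \theta^2 s^2)^2}$.

This is of course to be expected, since in the limit we are observing $\sqrt{\sigma^2 + \theta^2 s^2}\, W$, and we supposedly
know s and wish to estimate $\sigma$.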

Another interesting feature, due to the fact that $\Delta \to 0$, is that the limiting behavior of $I_\Delta^{\sigma\beta}$ and
$I_\Delta^{\beta\beta}$ when $\beta < 2$ involves $\mathcal{I}(\beta)$ but not $\mathcal{K}(\beta)$, as one could have guessed at first glance.

4.2. Estimation of $\theta$. For the entries of Fisher's information matrix involving the parameter $\theta$,
things are more complicated. First, observe that $I_\Delta^{\theta\theta}(0, \beta, \theta, G)$ (that is the Fisher information for the
model $X = \theta Y$) does not necessarily exist, but of course if it does we have an inequality similar to
(20) for all $\sigma$:

(25) $I_\Delta^{\theta\theta}(\sigma, \beta, \theta, G) \le I_\Delta^{\theta\theta}(0, \beta, \theta, G)$.

Contrary to (20), however, this is a very rough estimate, which does not take into account the
properties of W. The $(\theta, \theta)$-Fisher information is usually much smaller than what the right side above
suggests, and we give below a more accurate estimate when Y has second moments, but without the
"domination" assumption that $G \in \mathcal{G}_\beta$. Recall the notation (18).

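Theorem 3. If $Y_1$ has a finite variance v and a mean m, we have

(26) $I_\Delta^{\theta\theta}(\sigma, \beta, \theta, G) \le \dfrac{\mathcal{J}(\beta)}{\sigma^2} \left( m^2 \Delta^{2 - 2/\beta} + v\, \Delta^{1 - 2/\beta} \right)$.

This estimate holds for all $\Delta > 0$. The asymptotic variant, which says that

(27) $\limsup_{\Delta \to 0}\ \Delta^{2/\beta - 1}\, I_\Delta^{\theta\theta}(\sigma, \beta, \theta, G) \le \dfrac{v\, \mathcal{J}(\beta)}{\sigma^2}$,

is sharp in some cases and not in others, as we will see in the examples below. These examples will
also highlight the fact that the "translation" Fisher information $\mathcal{J}(\beta)$ comes into the picture here.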
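
The bound of Theorem 3 is easy to evaluate once the translation information is available. The sketch below is a direct transcription of its right-hand side under stated assumptions: the function name and argument names are hypothetical, and the translation information $\mathcal{J}(\beta)$ is supplied by the caller (it equals 1 when $\beta = 2$; for $\beta < 2$ it has no closed form and would have to be computed numerically).

```python
def theta_information_bound(delta, sigma, beta, mean, variance, translation_info):
    """Upper bound J(beta)/sigma^2 * (m^2 Delta^(2-2/beta) + v Delta^(1-2/beta))."""
    drift_term = mean ** 2 * delta ** (2.0 - 2.0 / beta)
    variance_term = variance * delta ** (1.0 - 2.0 / beta)
    return translation_info / sigma ** 2 * (drift_term + variance_term)

if __name__ == "__main__":
    # beta = 2, so J(beta) = 1; compare a pure drift (v = 0) with a unit-variance Y_1.
    for delta in (1e-1, 1e-2, 1e-3):
        print(delta,
              theta_information_bound(delta, 1.0, 2.0, 1.0, 0.0, 1.0),
              theta_information_bound(delta, 1.0, 2.0, 1.0, 1.0, 1.0))
```

When v > 0 the variance term dominates as $\Delta \to 0$, which is why only v appears in the asymptotic variant of the bound.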

5. Examples. The calculations of the previous section involving the parameter $\theta$ can be made
fully explicit if we specify the distribution of the process Y, in some cases at least. We will always
suppose that $\beta$ is known in these examples.

5.1. Stable process plus drift. Here we assume that $Y_t = t$, so $G_\Delta = \delta_\Delta$ and $G = \delta_1$ (recall that
the notation $\delta$ means a Dirac mass):

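Theorem 4. The 2×2 Fisher information matrix for estimating $(\sigma, \theta)$ is

(28) $\begin{pmatrix} I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, \delta_1) & I_\Delta^{\sigma\theta}(\sigma, \beta, \theta, \delta_1) \\ I_\Delta^{\sigma\theta}(\sigma, \beta, \theta, \delta_1) & I_\Delta^{\theta\theta}(\sigma, \beta, \theta, \delta_1) \end{pmatrix} = \dfrac{1}{\sigma^2} \begin{pmatrix} \mathcal{I}(\beta) & 0 \\ 0 & \Delta^{2 - 2/\beta}\, \mathcal{J}(\beta) \end{pmatrix}$.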

This has several interesting consequences (we will denote by $T_n = n\Delta_n$ the length of the observation
window):

1. If $\theta$ is known, one may hope for estimators $\hat\sigma_n$ for $\sigma$ such that $\sqrt{n}(\hat\sigma_n - \sigma) \to N(0, \sigma^2/\mathcal{I}(\beta))$
(that is, asymptotically efficient in the Cramer-Rao sense). As a matter of fact, in this setting,
observing $\chi_i^n$ is equivalent to observing $\chi_i'^n = \chi_i^n - \theta\Delta_n$, so we are in the situation of Section 3.

2. If $\sigma$ is known, one may hope for estimators $\hat\theta_n$ for $\theta$ such that $\sqrt{n}\, \Delta_n^{1 - 1/\beta} (\hat\theta_n - \theta)$ converges
in law to $N(0, \sigma^2/\mathcal{J}(\beta))$. If $\beta = 2$ the rate is thus $\sqrt{T_n}$: this is in accordance with the well
known fact that for a diffusion the rate for estimating the drift coefficient is the square root of
the total observation window, that is $\sqrt{T_n}$ here; moreover in this case, the variable $X_{T_n}/T_n$ is
$N(\theta, \sigma^2/T_n)$, so $\hat\theta_n = X_{T_n}/T_n$ is an asymptotically efficient estimator for $\theta$ (recall that $\mathcal{J}(\beta) = 1$
when $\beta = 2$). When $\beta < 2$ we have $1 - 1/\beta < 1/2$, so the rate is bigger than $\sqrt{T_n}$, and it increases
when $\beta$ decreases; when $\beta < 1$ this rate is even bigger than $\sqrt{n}$.

Observe that here $Y_1$ has mean m = 1 and variance v = 0: so the estimate (26) is indeed an
equality. The fact that the translation Fisher information $\mathcal{J}(\beta)$ appears here is transparent.

3. If both $\sigma$ and $\theta$ are unknown, one may hope for estimators $\hat\sigma_n$ and $\hat\theta_n$ such that the pairs
$\left( \sqrt{n}(\hat\sigma_n - \sigma),\ \sqrt{n}\, \Delta_n^{1 - 1/\beta} (\hat\theta_n - \theta) \right)$ converge in law to the product $N(0, \sigma^2/\mathcal{I}(\beta)) \otimes N(0, \sigma^2/\mathcal{J}(\beta))$.
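
The Brownian case of item 2 can be illustrated by simulation. The sketch below is only a sanity check under stated assumptions ($\beta = 2$, $Y_t = t$, Gaussian increments drawn with Python's standard `random` module, hypothetical function names and parameter values); it compares the empirical variance of $X_{T_n}/T_n$ with $\sigma^2/T_n$.

```python
import math
import random

def simulate_drift_estimate(theta, sigma, n, delta, rng):
    """One draw of theta_hat = X_{T_n} / T_n for X_t = sigma * W_t + theta * t, beta = 2."""
    step_sd = sigma * math.sqrt(delta)
    # Each increment is N(theta * delta, sigma^2 * delta); their sum is X_{T_n}.
    x_total = sum(rng.gauss(theta * delta, step_sd) for _ in range(n))
    return x_total / (n * delta)

def empirical_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

if __name__ == "__main__":
    rng = random.Random(12345)
    theta, sigma, n, delta = 0.5, 2.0, 100, 0.1   # T_n = n * delta = 10
    draws = [simulate_drift_estimate(theta, sigma, n, delta, rng) for _ in range(2000)]
    print("empirical variance:", empirical_variance(draws))
    print("sigma^2 / T_n     :", sigma ** 2 / (n * delta))
```

The two printed values should be close, up to Monte Carlo error, whatever the split of $T_n$ into n and $\Delta_n$; only the window length matters for the drift in this case.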



5.2. Stable process plus Poisson process. Here we assume that Y is a standard Poisson process
(jumps of size 1, intensity 1), whose law we write as $G = P$. We can describe the limiting behavior
of the $(\sigma, \theta)$ block of the matrix $I_\Delta(\sigma, \beta, \theta, P)$ as $\Delta \to 0$.

Theorem 5. If Y is a standard Poisson process we have, as $\Delta \to 0$:

(29) $I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, P) \to \dfrac{1}{\sigma^2}\, \mathcal{I}(\beta)$,

(30) $\Delta^{1/\beta - 1/2}\, I_\Delta^{\sigma\theta}(\sigma, \beta, \theta, P) \to 0$,

(31) $\Delta^{2/\beta - 1}\, I_\Delta^{\theta\theta}(\sigma, \beta, \theta, P) \to \dfrac{1}{\sigma^2}\, \mathcal{J}(\beta)$.

Since $P \in \mathcal{G}_\beta$, (29) is nothing else than the first part of (22). One could prove more than (30),
namely that $\sup_\Delta \Delta^{1/\beta - 1}\, |I_\Delta^{\sigma\theta}(\sigma, \beta, \theta, P)| < \infty$. Here again, we deduce some interesting consequences:



1. If $\sigma$ is known, one may hope for estimators $\hat\theta_n$ for $\theta$ such that $\sqrt{n}\, \Delta_n^{1/2 - 1/\beta} (\hat\theta_n - \theta)$ converge
in law to $N(0, \sigma^2/\mathcal{J}(\beta))$. So the rate is bigger than $\sqrt{n}$, except when $\beta = 2$. More generally,
if both $\sigma$ and $\theta$ are unknown, one may hope for estimators $\hat\sigma_n$ and $\hat\theta_n$ such that the
pairs $\left( \sqrt{n}(\hat\sigma_n - \sigma),\ \sqrt{n}\, \Delta_n^{1/2 - 1/\beta} (\hat\theta_n - \theta) \right)$ converge in law to the product
$N(0, \sigma^2/\mathcal{I}(\beta)) \otimes N(0, \sigma^2/\mathcal{J}(\beta))$.

2. However, the above-described behavior of any estimator $\hat\theta_n$ cannot be true when $T_n = n\Delta_n$ does
not go to infinity, because in this case there is a positive probability that Y has no jump on the
biggest observed interval, and so no information about $\theta$ can be drawn from the observations
in that case. It is true, though, when $T_n \to \infty$, because Y will eventually have infinitely many
jumps on the observed intervals. This discrepancy between the asymptotic behavior of Fisher
information and of estimators shows that some care must be taken when the Fisher information
is used as a measure of the quality of estimators.

3. Observe that here $Y_1$ has mean m = 1 and variance v = 1. So in view of (31) the asymptotic
estimate (27) is sharp.



5.3. Stable process plus compound Poisson process. Here we assume that Y is a compound Poisson
process with arrival rate $\lambda$ and law of jumps $\mu$: that is, the characteristics of G are $b = \lambda \int_{\{|x| \le 1\}} x\, \mu(dx)$
and $c = 0$ and $F = \lambda\mu$. We then write $G = P_{\lambda,\mu}$, which belongs to $\mathcal{G}_\beta$.

We will further assume that $\mu$ has a density f satisfying:

(32) $\lim_{|u| \to \infty} u f(u) = 0, \qquad \sup_u \left( |f'(u)| (1 + |u|) \right) < \infty$.

We also suppose that the "multiplicative" Fisher information associated with $\mu$ (that is, the Fisher
information for estimating $\theta$ in the model when one observes a single variable $\theta U$ with U distributed
according to $\mu$) exists. It then has the form

(33) $\mathcal{L} = \int \dfrac{\left( u f'(u) + f(u) \right)^2}{f(u)}\, du$.

We can describe the limiting behavior of the $(\sigma, \theta)$ block of the matrix $I_\Delta(\sigma, \beta, \theta, P_{\lambda,\mu})$ as $\Delta \to 0$.

Theorem 6. If Y is a compound Poisson process satisfying (32) and such that $\mathcal{L}$ in (33) is finite, we
have as $\Delta \to 0$:

(34) $I_\Delta^{\sigma\sigma}(\sigma, \beta, \theta, P_{\lambda,\mu}) \to \dfrac{1}{\sigma^2}\, \mathcal{I}(\beta)$,

(35) $\dfrac{1}{\sqrt{\Delta}}\, I_\Delta^{\sigma\theta}(\sigma, \beta, \theta, P_{\lambda,\mu}) \to 0$,
and 

(36) ^ / (^^^(^) + dx < hminf i /i^a, /?, P,,^) < limsup 1 /i^(a, /?, G) < C 
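when $\beta < 2$ ($c_\beta$ is the constant defined in (4)) and also, when $\beta = 2$:

(37) $\dfrac{1}{\Delta}\, I_\Delta^{\theta\theta}(\sigma, \beta, \theta, P_{\lambda,\mu}) \to \dfrac{\lambda}{\theta^2}\, \mathcal{L}$.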



As for the previous theorem, (34) is nothing else than the first part of (22). We could prove more
than (35), namely that $\sup_\Delta \frac{1}{\Delta}\, |I_\Delta^{\sigma\theta}(\sigma, \beta, \theta, P_{\lambda,\mu})| < \infty$. Here again, we deduce some interesting
consequences:

1. One may hope for estimators $\hat\theta_n$ for $\theta$ such that $\sqrt{T_n}(\hat\theta_n - \theta)$ is tight (the rate is the same as
for the case $Y_t = t$), and is even asymptotically normal when $\beta = 2$.

2. However, this is not true when $T_n$ does not go to infinity, for the same reason as for the previous
theorem.

3. When the measure $\mu$ has a second order moment, the right side of (26) is larger than the result
of the previous theorem, so the estimate in Theorem 3 is not sharp.



The rates for estimating $\theta$ in the two previous theorems, and the limiting Fisher information as well, can be explained as follows (supposing that $\sigma$ is known, that we have n observations, and that $T_n \to \infty$):

1. For Theorem 5: $\theta$ comes into the picture whenever the Poisson process has a jump. On the interval $[0, T_n]$ we have an average of $T_n$ jumps, most of them being isolated in an interval $(i\Delta_n, (i+1)\Delta_n]$. So it essentially amounts to observing $T_n$ (or rather the integer part $[T_n]$) independent variables, all distributed as $\sigma\Delta_n^{1/\beta} W_1 + \theta$. The Fisher information for each of those (for estimating $\theta$) is $\mathcal{J}(\beta)/\sigma^2\Delta_n^{2/\beta}$, and the "global" Fisher information, namely $n I^{\theta\theta}_{\Delta_n}$, is approximately $T_n \mathcal{J}(\beta)/\sigma^2\Delta_n^{2/\beta} \sim n\,\mathcal{J}(\beta)/\sigma^2\Delta_n^{2/\beta-1}$.

2. For Theorem 6: again $\theta$ comes into the picture whenever the compound Poisson process has a jump. We have an average of $\lambda T_n$ jumps, so it essentially amounts to observing $\lambda T_n$ independent variables, all distributed as $\sigma\Delta_n^{1/\beta} W_1 + \theta V$ where V has the distribution $\mu$. The Fisher information for each of those (for estimating $\theta$) is approximately $\mathcal{L}/\theta^2$ (because the variable $\sigma\Delta_n^{1/\beta} W_1$ is negligible), and the "global" Fisher information $n I^{\theta\theta}_{\Delta_n}$ is approximately $\lambda T_n \mathcal{L}/\theta^2 \sim n\Delta_n \lambda\mathcal{L}/\theta^2$. This explains the rate in (36), and is an indication that (37) may be true even when $\beta < 2$, although we have been unable to prove it thus far.
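
The counting argument above can be put into a few lines of Python. This is only a sketch of the heuristic, not part of any proof: the function names are ours, `j_beta` and `l_info` stand for the constants $\mathcal{J}(\beta)$ and $\mathcal{L}$ and must be supplied by the user (the numerical values used below are arbitrary placeholders), and `prob_isolated_jump` quantifies the claim that most jumps are isolated in their sampling interval.

```python
import math


def per_jump_info(sigma, beta, delta, j_beta):
    """Information about theta in one variable sigma * delta**(1/beta) * W_1 + theta."""
    return j_beta / (sigma ** 2 * delta ** (2.0 / beta))


def global_info_poisson(n, sigma, beta, delta, j_beta):
    """Heuristic global information: about T_n = n * delta isolated jumps on [0, T_n]."""
    t_n = n * delta
    return t_n * per_jump_info(sigma, beta, delta, j_beta)


def global_info_compound(n, delta, lam, theta, l_info):
    """Heuristic global information: lam * T_n jumps, each worth about l_info / theta**2."""
    return lam * n * delta * l_info / theta ** 2


def prob_isolated_jump(delta, lam=1.0):
    """P(exactly one jump | at least one jump) in a sampling interval of length delta."""
    m = lam * delta
    return m * math.exp(-m) / (-math.expm1(-m))


if __name__ == "__main__":
    n, sigma, beta, delta = 10_000, 2.0, 1.5, 0.01
    # The two expressions of the "global" information agree because T_n = n * delta.
    print(global_info_poisson(n, sigma, beta, delta, j_beta=1.0))
    print(n * 1.0 / (sigma ** 2 * delta ** (2.0 / beta - 1.0)))
    print(prob_isolated_jump(delta))
```

The probability returned by `prob_isolated_jump` tends to 1 as the sampling interval shrinks, which is what makes the reduction to independent single-jump observations plausible.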



5.4. Two stable processes. Our last example is about the case where Y is also a symmetric stable process with index $\alpha$, $\alpha < \beta$. We write $G = S_\alpha$. Surprisingly, the results are quite involved, in the sense that for estimating $\theta$ we have different situations according to the relative values of $\alpha$ and $\beta$. We obviously still have (34), so we concentrate on the term $I^{\theta\theta}_\Delta$ and ignore the cross term in the statement of the following theorem:

Theorem 7. If Y is a standard symmetric stable process with index $\alpha < \beta$, we have as $\Delta \to 0$:
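(38) $\beta = 2$: $\quad \frac{(\log(1/\Delta))^{\alpha/2}}{\Delta^{(\beta-\alpha)/\beta}}\, I^{\theta\theta}_\Delta(\sigma,\beta,\theta,S_\alpha) \to \frac{2\alpha c_\alpha \beta^{\alpha/2}}{\theta^{2-\alpha}\sigma^{\alpha}\,(2(\beta-\alpha))^{\alpha/2}}$

(39) $\beta < 2,\ \alpha > \beta/2$: $\quad \frac{1}{\Delta^{2(\beta-\alpha)/\beta}}\, I^{\theta\theta}_\Delta(\sigma,\beta,\theta,S_\alpha) \to \frac{\alpha^2 c_\alpha^2 \theta^{2\alpha-2}}{\sigma^{2\alpha}} \int \frac{\left(\int_{\mathbb{R}} |y|^{1-\alpha}\,dy \int_0^1 (1-v)\, h''_\beta(x - yv)\,dv\right)^2}{h_\beta(x)}\,dx$

(40) $\beta < 2,\ \alpha = \beta/2$: $\quad \frac{1}{\Delta^{2(\beta-\alpha)/\beta}\,\log(1/\Delta)}\, I^{\theta\theta}_\Delta(\sigma,\beta,\theta,S_\alpha) \to \frac{\alpha c_\alpha^2 \theta^{2\alpha-2}}{\sigma^{2\alpha} c_\beta}$

(41) $\beta < 2,\ \alpha < \beta/2$: $\quad \frac{1}{\Delta}\, I^{\theta\theta}_\Delta(\sigma,\beta,\theta,S_\alpha) \to \frac{\alpha^2 c_\alpha^2 \theta^{2\alpha-2}}{\sigma^{2\alpha}} \int \frac{dz}{c_\beta |z|^{1+2\alpha-\beta} + c_\alpha \theta^\alpha |z|^{1+\alpha}/\sigma^\alpha}$.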
Then if $\sigma$ is known one may hope to find estimators $\hat\theta_n$ for $\theta$ such that $u_n(\hat\theta_n - \theta)$ converges in law to $N(0, V)$, with



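$u_n = \sqrt{n}\;\Delta_n^{\frac{\beta-\alpha}{2\beta}} \big/ (\log(1/\Delta_n))^{\alpha/4}$ if $\beta = 2$

$u_n = \sqrt{n}\;\Delta_n^{\frac{\beta-\alpha}{\beta}}$ if $\beta < 2$, $\alpha > \beta/2$

$u_n = \sqrt{T_n}\,\sqrt{\log(1/\Delta_n)}$ if $\beta < 2$, $\alpha = \beta/2$

$u_n = \sqrt{T_n}$ if $\beta < 2$, $\alpha < \beta/2$,

and of course the asymptotic variance V should be the inverse of the right hand sides in (38)-(41).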



6. The multiplicative model. Another interesting situation is the model @, when [i is known. 
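If we observe a single variable $X_\Delta$, the corresponding Fisher information is obviously

$I_\Delta(\sigma,\beta,G) = I^{\sigma\sigma}_\Delta(\sigma,\beta,\sigma,G) + 2\,I^{\sigma\theta}_\Delta(\sigma,\beta,\sigma,G) + I^{\theta\theta}_\Delta(\sigma,\beta,\sigma,G).$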

We will not develop a full theory here, but we translate the examples of the previous section into this
setting. In view of the previous results, the proofs of the next three theorems follow, and these results
show the variety of situations we may encounter for this multiplicative model. The problems dealt
with in Theorems IHl llUI below have been solved previously by Far (2001) and Jedidi (2001). 

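Theorem 8. If $Y_t = t$ the Fisher information for estimating $\sigma$ in the multiplicative model satisfies, as $\Delta \to 0$:

$I_\Delta(\sigma,\beta,\delta_1) \to \frac{1}{\sigma^2}\,\mathcal{I}(\beta)$ if $\beta \in (1,2]$

$I_\Delta(\sigma,\beta,\delta_1) \to \frac{1}{\sigma^2}\,(\mathcal{I}(\beta) + \mathcal{J}(\beta))$ if $\beta = 1$

$\Delta^{2/\beta-2}\, I_\Delta(\sigma,\beta,\delta_1) \to \frac{1}{\sigma^2}\,\mathcal{J}(\beta)$ if $\beta \in (0,1)$.

So in this situation we may hope for estimators $\hat\sigma_n$ for $\sigma$ such that

$\sqrt{n}\,(\hat\sigma_n - \sigma) \xrightarrow{d} N\left(0, \frac{\sigma^2}{\mathcal{I}(\beta)}\right)$ if $\beta \in (1,2]$

$\sqrt{n}\,(\hat\sigma_n - \sigma) \xrightarrow{d} N\left(0, \frac{\sigma^2}{\mathcal{I}(\beta) + \mathcal{J}(\beta)}\right)$ if $\beta = 1$

$\sqrt{n}\,\Delta_n^{1-1/\beta}\,(\hat\sigma_n - \sigma) \xrightarrow{d} N\left(0, \frac{\sigma^2}{\mathcal{J}(\beta)}\right)$ if $\beta \in (0,1)$.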
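
The three regimes of Theorem 8 can be read off a single expression. The sketch below assumes that, when $Y_t = t$, the cross term vanishes and $I^{\theta\theta}_\Delta = \Delta^{2-2/\beta}\mathcal{J}(\beta)/\sigma^2$, so that $I_\Delta = (\mathcal{I}(\beta) + \Delta^{2-2/\beta}\mathcal{J}(\beta))/\sigma^2$; this identity is our reading of the decomposition above, not a formula quoted from the text. As a sanity check we also assume that $h_2$ is the standard normal density (so $\mathcal{I}(2) = 2$ and $\mathcal{J}(2) = 1$) and compare with the classical Fisher information of a Gaussian family. All function names are hypothetical.

```python
import math


def info_sigma_drift(sigma, beta, delta, i_beta, j_beta):
    """(I(beta) + delta**(2 - 2/beta) * J(beta)) / sigma**2 for Y_t = t and theta = sigma."""
    return (i_beta + delta ** (2.0 - 2.0 / beta) * j_beta) / sigma ** 2


def gaussian_info_sigma(sigma, delta):
    """Classical Fisher information for sigma when X ~ N(sigma * delta, sigma**2 * delta)."""
    mean_prime = delta                   # derivative of the mean with respect to sigma
    var = sigma ** 2 * delta
    var_prime = 2.0 * sigma * delta      # derivative of the variance with respect to sigma
    return mean_prime ** 2 / var + var_prime ** 2 / (2.0 * var ** 2)


def normalizing_rate(n, delta, beta):
    """Rate multiplying (sigma_hat - sigma) in the three regimes of Theorem 8."""
    if beta >= 1.0:
        return math.sqrt(n)
    return math.sqrt(n) * delta ** (1.0 - 1.0 / beta)


if __name__ == "__main__":
    print(info_sigma_drift(1.5, 2.0, 0.1, i_beta=2.0, j_beta=1.0))
    print(gaussian_info_sigma(1.5, 0.1))
    print(normalizing_rate(10_000, 0.01, 0.5))
```

When $\beta < 1$ the factor $\Delta_n^{1-1/\beta}$ is large, which is why the drift term dominates the information in that regime.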
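Theorem 9. If Y is a standard Poisson process, the Fisher information for estimating $\sigma$ in the multiplicative model satisfies, as $\Delta \to 0$:

$I_\Delta(\sigma,\beta,P) \to \frac{1}{\sigma^2}\,(\mathcal{I}(\beta) + \mathcal{J}(\beta))$ if $\beta = 2$

$\Delta^{2/\beta-1}\, I_\Delta(\sigma,\beta,P) \to \frac{1}{\sigma^2}\,\mathcal{J}(\beta)$ if $\beta \in (0,2)$.

So in this situation, and as soon as $T_n \to \infty$, we may hope for estimators $\hat\sigma_n$ for $\sigma$ such that

$\sqrt{n}\,(\hat\sigma_n - \sigma) \xrightarrow{d} N\left(0, \frac{\sigma^2}{\mathcal{I}(\beta) + \mathcal{J}(\beta)}\right)$ if $\beta = 2$

$\sqrt{n}\,\Delta_n^{1/2-1/\beta}\,(\hat\sigma_n - \sigma) \xrightarrow{d} N\left(0, \frac{\sigma^2}{\mathcal{J}(\beta)}\right)$ if $\beta \in (0,2)$.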

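Theorem 10. If Y is a compound Poisson process satisfying (32) and such that $\mathcal{L}$ in (33) is finite, and also if Y is a symmetric stable process with index $\alpha < \beta$, the Fisher information for estimating $\sigma$ in the multiplicative model satisfies, as $\Delta \to 0$:

$I_\Delta(\sigma,\beta,G) \to \frac{1}{\sigma^2}\,\mathcal{I}(\beta)$.

So in this situation, and as soon as $T_n \to \infty$, we may hope for estimators $\hat\sigma_n$ for $\sigma$ such that

$\sqrt{n}\,(\hat\sigma_n - \sigma) \xrightarrow{d} N\left(0, \frac{\sigma^2}{\mathcal{I}(\beta)}\right)$.

7. Proofs.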

7.1. Preliminaries about the class $\mathcal{G}(\phi, \alpha)$. In the sequel, we denote by $C_\gamma$ a constant depending
only on the parameter $\gamma$, and which may change from line to line.

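Lemma 1. Let $\phi \in \Phi$ and $\alpha \in (0,2]$. There is an increasing function $\phi_\alpha : (0,1] \to \mathbb{R}_+$ having $\lim_{x \downarrow 0} \phi_\alpha(x) = 0$ and $\phi \le \phi_\alpha$ on $(0,1]$, such that for all $G \in \mathcal{G}(\phi, \alpha)$ and $\varepsilon \in (0,1]$ we have

(42) $\int_{\{|x| \le \varepsilon\}} |x|^q F(dx) \le \begin{cases} \frac{q}{q-\alpha}\,\varepsilon^{q-\alpha}\,\phi_\alpha(\varepsilon) & \text{if } q > \alpha \\ \phi_\alpha(\varepsilon) & \text{if } q = \alpha = 2, \end{cases}$

(43) $\int_{\{\varepsilon < |x| \le 1\}} |x|\, F(dx) \le \begin{cases} \phi_\alpha(1) & \text{if } \alpha < 1 \\ \phi_\alpha(\varepsilon)\,\log(1/\varepsilon) & \text{if } \alpha = 1 \\ \phi_\alpha(\varepsilon)\,\varepsilon^{1-\alpha} & \text{if } \alpha > 1. \end{cases}$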
Proof. First we define $\phi_\alpha$ as follows, for $x \in (0, 1)$:






0M 

l-a 



X + 



0W 



+ (A 1 A 



Vlog(l/a;) 



a— 1 a—1 

It is clear that (pai^) — > as x | 0, and that </> < (/>o, on (0, 1) 



if a < 1 
if a = 1 
if a > 1. 



H42p when g = a = 2 is trivial because (/> < i?i>Q,. When q > a, Fubini Theorem and © yield: 



{\x\<e} 



q — a 



^F{dx) = I F{dx) q / y'^'^dy = q y''~'^F{\x\ > y) dy 

J{\x\<e} Jo Jo 

Jo 

because <j) is increasing: so we get (|42|) again. 

In a similar way, for every z £ [e, 1] we get 

r r r\^\ 

/ \x\F{dx) = / F{dx) / dy 

J{e<\x\<l} J|e<|x|<l| Jo 



re i-z pi 

I F{£ < \x\ <l) dy+ I F{y < \x\ < 1) dy + / F{y < \x\ < 1) dy 

Jo Je Jz 



< cp{e)e'-" + 0(z) J y-" dy + ^{1) J dy, 

Then in view of our definition of 0^, a simple calculation allows to deduce ()43|). upon taking z = 1 
when a < 1, and z = 1 when a = 1 and e > 1/e, and z = exp — Y^log(l/e) if a = 1 and e < 1/e, and 
z = -y/e when a > 1. □ 

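In view of (42), for any pair $(G, \alpha)$ such that $G \in \mathcal{G}_\alpha$ we can introduce the following notation:

(44) $b'(G,\alpha) = \begin{cases} b - \int_{\{|x| \le 1\}} x\,F(dx) & \text{if } \alpha < 1, \\ b & \text{if } \alpha \ge 1, \end{cases} \qquad Z_\Delta(\alpha,\beta) := \Delta^{-1/\beta}\,\big(Y_\Delta - b'(G,\alpha)\,\Delta\big),$

and we let $G'_{\Delta,\alpha,\beta}$ denote the law of $Z_\Delta(\alpha,\beta)$. Part (c) of the forthcoming lemma is not fully used
here, but will be in the companion paper on estimation.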

Lemma 2. a) If $G \in \mathcal{G}_\beta$ then $G'_{\Delta,\beta,\beta}$ converges to the Dirac mass $\delta_0$ as $\Delta \to 0$.

b) If $\alpha \le \beta$ and $\phi \in \Phi$ and $G^n$ is a sequence of measures in $\mathcal{G}(\phi, \alpha)$ and $\Delta_n \to 0$, then the associated
sequence $G'^{\,n}_{\Delta_n,\alpha,\beta}$ converges to the Dirac mass $\delta_0$ as $n \to \infty$.

c) If $\alpha \le \beta$ and $\phi \in \Phi$ there is a constant $C = C_\alpha$ such that for all functions g with $|g(x)| \le K(1 \wedge |x|)$
and all $\Delta \in (0, 1]$, we have (with $\phi_\alpha$ like in the previous lemma):



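(45) $E\big(|g(Z_\Delta(\alpha,\beta))|\big) \le C K\, \Delta^{\frac{2(\beta-\alpha)}{\beta(2+\alpha)}}\, \phi_\alpha\big(\Delta^{\frac{2+\beta}{\beta(2+\alpha)}}\big).$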

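Proof. Observe that (c) $\Rightarrow$ (b) $\Rightarrow$ (a), so we prove (c) only.

Let $\eta \in (0,1/2]$ to be chosen later. To any given $G \in \mathcal{G}(\phi,\alpha)$ we associate the Levy process
Y and the characteristics $(b, 0, F)$. Let $F'$ and $F''$ be the restrictions of F to the sets $[-\eta,\eta]$ and
$[-\eta,\eta]^c$ respectively. We can decompose Y into the sum $Y_t = at + Y'_t + Y''_t$, where $Y'$ is a Levy
process with characteristics $(0, 0, F')$ and $Y''$ is a compound Poisson process with Levy measure $F''$,
and $a = b - \int_{\{\eta < |x| \le 1\}} x\,F(dx)$. Then $a' = a - b'(G,\alpha)$ is (recall (44)):

$a' = \begin{cases} \int_{\{|x| \le \eta\}} x\,F(dx) & \text{if } \alpha < 1 \\ -\int_{\{\eta < |x| \le 1\}} x\,F(dx) & \text{if } \alpha \ge 1. \end{cases}$

Therefore (42) and (43) yield (for a constant $C = C_\alpha$ not depending on $G \in \mathcal{G}(\phi,\alpha)$):

(46) $|a'| \le \begin{cases} C\,\eta^{1-\alpha}\,\phi_\alpha(\eta) & \text{if } \alpha \ne 1 \\ C\,\log(1/\eta)\,\phi_\alpha(\eta) & \text{if } \alpha = 1. \end{cases}$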



Also, since Y' has no drift, no Wiener part, and no jump bigger than 1, one knows (by differentiating 
© for example) that E{{Y^f ) = t J x'^F'{dx). Then ^ again yields for some C — Ca- 



(47) 



E(|yAn <GArj 



2-a, 



We set Za = Z^{a^(3). Since \g\ < K we have \g{Zi\)\ < K. If further Y'^ = 0, we have also 
Fa = aA + y^, hence Za = A-V/3(y^ + a'A), hence |c/(Za)| < i^A-i/^(|y^| + A|a'|). Now, we have 
P(y^ / 0) < AF"{1R) < A(t)a{v)/v" because G £ G{(t),a). Therefore we deduce from ^ and ^ 
that for some constant C = Gk a- 



E(|<7(^a)|) < <^ 



C7i^ (At?-! + AV2-i//3^i/2 + Ai-V/9iog(l/ry)) 0i(r7) ^ = I 
CK (Ar?-" + Ai/2-i//3^i-"/2 + Ai-i//3^i-a^ otherwise 
as soon as G £ G{(t>,a). Then take r] = a(2+/3)//3(2+") to get 



□ 
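
The choice of $\eta$ at the end of this proof is a balancing of exponents, which can be checked with exact rational arithmetic. The sketch below takes the three terms of the last bound in the generic case, $\Delta\eta^{-\alpha}$, $\Delta^{1/2-1/\beta}\eta^{1-\alpha/2}$ and $\Delta^{1-1/\beta}\eta^{1-\alpha}$, substitutes $\eta = \Delta^{(2+\beta)/\beta(2+\alpha)}$, and returns the resulting powers of $\Delta$. The function names are ours, and the common value $2(\beta-\alpha)/\beta(2+\alpha)$ of the first two exponents is our own computation.

```python
from fractions import Fraction


def eta_exponent(alpha, beta):
    """Exponent e in the choice eta = Delta**e, with e = (2 + beta) / (beta * (2 + alpha))."""
    return (2 + beta) / (beta * (2 + alpha))


def delta_exponents(alpha, beta):
    """Powers of Delta carried by the three terms once eta = Delta**e is substituted."""
    e = eta_exponent(alpha, beta)
    first = 1 - alpha * e                                      # Delta * eta**(-alpha)
    second = Fraction(1, 2) - 1 / beta + (1 - alpha / 2) * e   # Delta**(1/2 - 1/beta) * eta**(1 - alpha/2)
    third = 1 - 1 / beta + (1 - alpha) * e                     # Delta**(1 - 1/beta) * eta**(1 - alpha)
    return first, second, third


if __name__ == "__main__":
    alpha, beta = Fraction(1, 2), Fraction(3, 2)
    print(delta_exponents(alpha, beta))
```

With this choice the first two terms are of the same order and the third one is of smaller order (its exponent exceeds the common one by $(\beta-\alpha)/\beta(2+\alpha) \ge 0$), so a single power of $\Delta$ controls the whole bound.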
7.2. Fisher's information when $X = \sigma W + \theta Y$. From independence of W and Y, the density of
$X_\Delta$ is the convolution (recall $G_\Delta = \mathcal{L}(Y_\Delta)$):

(48) $p_\Delta(x|\sigma,\beta,\theta,G) = \frac{1}{\sigma\Delta^{1/\beta}} \int G_\Delta(dy)\, h_\beta\left(\frac{x - \theta y}{\sigma\Delta^{1/\beta}}\right).$

We now seek to characterize the entries of the full Fisher information matrix. Since $h'_\beta$ and $\breve h_\beta$ and
$\dot h_\beta$ are continuous and bounded, we can differentiate under the integral in (48) to get

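(49) $\partial_\sigma p_\Delta(x|\sigma,\beta,\theta,G) = -\frac{1}{\sigma^2\Delta^{1/\beta}} \int G_\Delta(dy)\, \breve h_\beta\left(\frac{x - \theta y}{\sigma\Delta^{1/\beta}}\right),$

(50) $\partial_\beta p_\Delta(x|\sigma,\beta,\theta,G) = V_\Delta(x|\sigma,\beta,\theta,G) - \frac{\sigma \log\Delta}{\beta^2}\, \partial_\sigma p_\Delta(x|\sigma,\beta,\theta,G),$

(51) $\partial_\theta p_\Delta(x|\sigma,\beta,\theta,G) = -\frac{1}{\sigma^2\Delta^{2/\beta}} \int G_\Delta(dy)\, y\, h'_\beta\left(\frac{x - \theta y}{\sigma\Delta^{1/\beta}}\right),$

where

(52) $V_\Delta(x|\sigma,\beta,\theta,G) = \frac{1}{\sigma\Delta^{1/\beta}} \int G_\Delta(dy)\, \dot h_\beta\left(\frac{x - \theta y}{\sigma\Delta^{1/\beta}}\right).$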

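The entries of the $(\sigma, \theta)$ block of the Fisher information matrix are (leaving implicit the dependence
on $(\sigma,\beta,\theta,G)$):

(53) $I^{\sigma\sigma}_\Delta = \int \frac{(\partial_\sigma p_\Delta(x))^2}{p_\Delta(x)}\,dx, \qquad I^{\sigma\theta}_\Delta = \int \frac{\partial_\sigma p_\Delta(x)\, \partial_\theta p_\Delta(x)}{p_\Delta(x)}\,dx, \qquad I^{\theta\theta}_\Delta = \int \frac{(\partial_\theta p_\Delta(x))^2}{p_\Delta(x)}\,dx.$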

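When $\beta < 2$, the other entries are

(54) $I^{\sigma\beta}_\Delta = J^{\sigma\beta}_\Delta - \frac{\sigma\log\Delta}{\beta^2}\, I^{\sigma\sigma}_\Delta, \qquad I^{\beta\theta}_\Delta = J^{\beta\theta}_\Delta - \frac{\sigma\log\Delta}{\beta^2}\, I^{\sigma\theta}_\Delta,$

(55) $I^{\beta\beta}_\Delta = J^{\beta\beta}_\Delta - \frac{2\sigma\log\Delta}{\beta^2}\, J^{\sigma\beta}_\Delta + \frac{\sigma^2(\log\Delta)^2}{\beta^4}\, I^{\sigma\sigma}_\Delta,$

where

(56) $J^{\sigma\beta}_\Delta = \int \frac{\partial_\sigma p_\Delta(x)\, V_\Delta(x)}{p_\Delta(x)}\,dx, \qquad J^{\beta\beta}_\Delta = \int \frac{V_\Delta(x)^2}{p_\Delta(x)}\,dx, \qquad J^{\beta\theta}_\Delta = \int \frac{V_\Delta(x)\, \partial_\theta p_\Delta(x)}{p_\Delta(x)}\,dx.$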
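
For $\beta = 2$ the integrals behind these entries can be evaluated by elementary quadrature. The sketch below assumes that $h_2$ is the standard normal density, that $\breve h_\beta(x) = h_\beta(x) + x h'_\beta(x)$, and that $\mathcal{I}(\beta) = \int \breve h_\beta^2/h_\beta$ and $\mathcal{J}(\beta) = \int (h'_\beta)^2/h_\beta$; these conventions are taken from our reading of the surrounding text and should be checked against the definitions given earlier in the paper. The helper names and the trapezoid rule are our choices.

```python
import math


def h2(x):
    """Standard normal density, standing in for h_beta when beta = 2 (assumption)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def h2_prime(x):
    """Derivative of the standard normal density."""
    return -x * h2(x)


def h2_breve(x):
    """h(x) + x * h'(x), the function entering the sigma-derivative of the density."""
    return h2(x) + x * h2_prime(x)


def trapezoid(f, a=-12.0, b=12.0, n=24000):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * step)
    return total * step


i_2 = trapezoid(lambda x: h2_breve(x) ** 2 / h2(x))           # candidate value of I(2)
j_2 = trapezoid(lambda x: h2_prime(x) ** 2 / h2(x))           # candidate value of J(2)
cross = trapezoid(lambda x: h2_breve(x) * h2_prime(x) / h2(x))  # odd integrand

if __name__ == "__main__":
    print(i_2, j_2, cross)
```

Under these assumptions the three integrals reduce to Gaussian moments, $E[(1-W^2)^2]$, $E[W^2]$ and $-E[W(1-W^2)]$ for a standard normal W, which gives a quick way to cross-check the numerical values.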



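7.3. Proof of Theorem 1. The proof is standard, and given for completeness, and only in
the case where $\beta < 2$ (when $\beta = 2$ take $v = 0$ below). What we need to prove is that, for any $u, v \in \mathbb{R}$,
we have

(57) $\int \frac{\big(u\,\partial_\sigma p_\Delta(x|\sigma,\beta,\theta,G) + v\,\partial_\beta p_\Delta(x|\sigma,\beta,\theta,G)\big)^2}{p_\Delta(x|\sigma,\beta,\theta,G)}\,dx \;\le\; \int \frac{\big(u\,\partial_\sigma p_\Delta(x|\sigma,\beta,0,\delta_0) + v\,\partial_\beta p_\Delta(x|\sigma,\beta,0,\delta_0)\big)^2}{p_\Delta(x|\sigma,\beta,0,\delta_0)}\,dx.$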
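We set

$q(x) = p_\Delta(x|\sigma,\beta,\theta,G), \qquad q_0(x) = p_\Delta(x|\sigma,\beta,0,\delta_0),$

$r(x) = u\,\partial_\sigma p_\Delta(x|\sigma,\beta,\theta,G) + v\,\partial_\beta p_\Delta(x|\sigma,\beta,\theta,G), \qquad r_0(x) = u\,\partial_\sigma p_\Delta(x|\sigma,\beta,0,\delta_0) + v\,\partial_\beta p_\Delta(x|\sigma,\beta,0,\delta_0).$

Observe that by (15),

$q(x) = \int G_\Delta(dy)\, q_0(x - \theta y),$

hence

$r(x) = \int G_\Delta(dy)\, r_0(x - \theta y)$

as well. Apply the Cauchy-Schwarz inequality to $G_\Delta$ with $r_0 = \sqrt{q_0}\,(r_0/\sqrt{q_0})$ to get

$r(x)^2 \le q(x) \int G_\Delta(dy)\, \frac{r_0(x - \theta y)^2}{q_0(x - \theta y)}.$

Then

$\int \frac{r(x)^2}{q(x)}\,dx \le \int dx \int G_\Delta(dy)\, \frac{r_0(x - \theta y)^2}{q_0(x - \theta y)} = \int \frac{r_0(z)^2}{q_0(z)}\,dz$

by Fubini and a change of variable: this is exactly (57).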



7.4. Proof of Theorem 2. Clearly (b) implies (a). If (b) fails, one can find $\phi \in \Phi$ and $\alpha \in (0, \beta]$
and $\varepsilon > 0$, and also a sequence $\Delta_n \to 0$ and a sequence $G^n$ of measures in $\mathcal{G}(\phi, \alpha)$ and a sequence of
numbers $\theta_n$ converging to a limit $\theta$, such that



I3<2 ^ 
for all n. 



lZ{^,(5,9n,G^) 



(3 = 2 : 



> e. 



+ 



lZ{<r,f5,9n,G^) I{(3) 



(log(l/A))^ 



+ 



log(l/A) 



> e 



In other words, to prove (a) and (b) it is enough to prove the following: let $\phi \in \Phi$ and $\alpha \in (0, \beta]$
and $\Delta_n \to 0$ and $\theta_n \to \theta$ and $G^n$ be a sequence in $\mathcal{G}(\phi, \alpha)$; then we have:



(58) 



ITM,P,On,Gn 



(59) 



/3 < 2 



,G") T(/?) /X!(a,/3,0„,G") T(/J) 



(log(l/A))2 



/34 



log(l/A) 
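Let us proceed to proving (58). The change of variable $x \leftrightarrow (x - \theta\, b'(G^n,\alpha)\Delta)/\sigma\Delta^{1/\beta}$ in (53) leads
to $I^{\sigma\sigma}_\Delta(\sigma,\beta,\theta,G) = \frac{1}{\sigma^2} \int s_{\Delta,\theta,G}(x)\,dx$, where

$s_{\Delta,\theta,G}(x) = \frac{\left(\int G'_{\Delta,\alpha,\beta}(du)\, \breve h_\beta(x - u\theta/\sigma)\right)^2}{\int G'_{\Delta,\alpha,\beta}(du)\, h_\beta(x - u\theta/\sigma)}.$

Since $h_\beta$ and $\breve h_\beta$ are continuous and bounded, we deduce from Lemma 2 that

$\int G'^{\,n}_{\Delta_n,\alpha,\beta}(du)\, h_\beta(x - u\theta_n/\sigma) \to h_\beta(x), \qquad \int G'^{\,n}_{\Delta_n,\alpha,\beta}(du)\, \breve h_\beta(x - u\theta_n/\sigma) \to \breve h_\beta(x).$

Thus $s_{\Delta_n,\theta_n,G^n}(x) \to \breve h_\beta(x)^2/h_\beta(x)$ for all x, and Fatou's Lemma yields

$\liminf_n I^{\sigma\sigma}_{\Delta_n}(\sigma,\beta,\theta_n,G^n) \ge \frac{1}{\sigma^2} \int \frac{\breve h_\beta(x)^2}{h_\beta(x)}\,dx = \frac{\mathcal{I}(\beta)}{\sigma^2}.$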

This, combined with ^ and (O and gives (|58|). 



Now suppose that j3 <2 and recall and with the notation J'^ {a, (3,9,G), etc... of 



'A 

we see that 



(61) Jf < ^Jll-{a,(3,d,G)J^^{a,(3,d,G) 

by a first application of Cauchy-Schwarz inequality. A second application of the same plus and 
(O yield 

Then Fubini and the change of variable x <^ {x — ey)/aA^/^ in ^ leads to 

(62) ji^ia,P,6,G)<fCiP). 

Then JSH) readily follow from ((SHI), dHD), 1121) and also ^ and (f^. 

It remains to prove (c). If we put together the majorations (j2fl]) . (j6T|) and ((62|l and also and 
((nSJ, we see that it is enough to prove that 

(63) limsup/r(a,/3,e,G")<^^^^ 



a2 



Let pn = A^"" , which by our assumption on A„ goes to 1. The measure G'^^ ^ admits the 
density x i-^ gn{x) = ha„{xpn)/ Pn, which converges to hp{x)] so G'^^ ^ weakly converges to the 
stable law with density hp. Then, exactly as in the previous proof, we get that 

( J hp{u)hjj{x — u9/a)du\ 
(64) SA„,e,G"{x) six) := . — • 
On the other hand $|\breve h_\beta(y)| \le C(1 \wedge |y|^{-1-\beta})$ for some constant C, and also $g_n(y) \le C(1 \wedge |y|^{-1-\alpha_n})$

with C not depending on n. Using once more Cauchy-Schwarz, we deduce from (60) that



Sa,An,G"(a;) < j gniu)hf3{x - u9/a) du 
(65) < s{x) -.= [ (lA j-j^ ) ( 1 A ,^ .:,.n^. ) 



for still another constant C, as soon as q„ > /? — e for some fixed e G (0,/3). But J s'{x)dx is finite, 
so ()64() and the dominated convergence theorem yield 

(66) /X:(cT,/3,0,G") -^^j s{x)dx. 

Finally, exactly as for (|65)) we deduce from the Cauchy-Schwarz inequality and from the fact that 
the functions Vhp and hjs/^/hp are not Lebesgue-almost surely multiple one from the other, while 
hj3 > identically, that in fact s{x) < J hp{u)hp{x — u9/a)du for all x. Therefore 



s{x)dx < J dx J hj3{u)hj3{x — u9/a)du = J hji{u)du j hi3{y)dy = J hp{y)dy = Z{(3), 
and 1)66(1 yields that holds, hence (c). 

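7.5. Proof of Theorem 3. The Cauchy-Schwarz inequality gives us, by (48) and (51):

$|\partial_\theta p_\Delta(x|\sigma,\beta,\theta,G)|^2 \le p_\Delta(x|\sigma,\beta,\theta,G)\; \frac{1}{\sigma^3\Delta^{3/\beta}} \int G_\Delta(dy)\, y^2\, \frac{h'_\beta\big((x - \theta y)/\sigma\Delta^{1/\beta}\big)^2}{h_\beta\big((x - \theta y)/\sigma\Delta^{1/\beta}\big)}.$

Plugging this into (53), applying Fubini and doing the change of variable $x \leftrightarrow (x - \theta y)/\sigma\Delta^{1/\beta}$ leads
to

$I^{\theta\theta}_\Delta \le \frac{1}{\sigma^2\Delta^{2/\beta}} \int G_\Delta(dy)\, y^2 \int \frac{h'_\beta(x)^2}{h_\beta(x)}\,dx.$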

Since E{Y'^) = mA^ + (5A, we readily deduce (|^ . 

7.6. Proof of Theorem 4. When $Y_t = t$ we have $G_\Delta = \varepsilon_\Delta$. Then (28) follows directly from applying
the formulae (49) and (51) and from the change of variable $x \leftrightarrow (x - \theta\Delta)/\sigma\Delta^{1/\beta}$ in (53), after observing
that the function $\breve h_\beta h'_\beta/h_\beta$ is integrable and odd, hence has a vanishing Lebesgue integral.
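
As a numerical illustration of this computation, consider the Gaussian case $\beta = 2$ with $Y_t = t$, where $X_\Delta \sim N(\theta\Delta, \sigma^2\Delta)$. The sketch below assumes that $h_2$ is the standard normal density (so that $\mathcal{J}(2) = 1$) and that the change of variable yields $I^{\theta\theta}_\Delta = \Delta^{2-2/\beta}\mathcal{J}(\beta)/\sigma^2$; the latter is our own evaluation of the integral, written in the function `scaling_formula`, and is compared with a direct quadrature of $\int (\partial_\theta p_\Delta)^2/p_\Delta$. All names are hypothetical.

```python
import math


def scaling_formula(sigma, beta, delta, j_beta):
    """delta**(2 - 2/beta) * J(beta) / sigma**2, the value suggested by the change of variable."""
    return delta ** (2.0 - 2.0 / beta) * j_beta / sigma ** 2


def fisher_theta_gaussian_drift(sigma, theta, delta, n_grid=20000, width=12.0):
    """Trapezoid quadrature of (d p / d theta)**2 / p for X ~ N(theta*delta, sigma**2 * delta)."""
    sd = sigma * math.sqrt(delta)
    mean = theta * delta
    start = mean - width * sd
    step = 2.0 * width * sd / n_grid
    total = 0.0
    for i in range(n_grid + 1):
        x = start + i * step
        p = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        dp = p * (x - mean) * delta / sd ** 2   # chain rule: the mean depends on theta through delta
        weight = 0.5 if i in (0, n_grid) else 1.0
        total += weight * dp * dp / p
    return total * step


if __name__ == "__main__":
    print(fisher_theta_gaussian_drift(1.3, 0.7, 0.05))
    print(scaling_formula(1.3, 2.0, 0.05, j_beta=1.0))
```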



7.7. Proof of Theorem 5. a) We first introduce some notation to be used also for the proof of
Theorem 6. We suppose that Y is a compound Poisson process with arrival rate $\lambda$ and law of jumps
$\mu$, and $\mu_k$ is the k-fold convolution of $\mu$. So we have



A:=0 



A:! 



1 



/ X — 9u\ 



Set 
(67) 

(68) 7i'^(fc,^) = J l^k{du) hp (^^^ J 

(69) 7a ^) = J f^kidn) u hp (^^^ j ■ 
We have (recall that /^o = <^o): 

(70) pa(x|ct, /?, 0, G) = e-^^ ^ 7!'^ (fc, ^), 

fc=0 

(71) d^pA{x\a,P,9,G) = -c-'^Y.^-^^'aHKx), 

k=0 

(72) dgpAix\a,P,e,G) = -e-^^f;^^ 7i3)(A:,x), 



fc=i 
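
The series representation of the density can be explored numerically in the simplest case. The sketch below assumes $\beta = 2$ with $h_2$ the standard normal density and Y a standard Poisson process ($\lambda = 1$, jumps of size 1), so that $X_\Delta$ is a Poisson mixture of Gaussians with means $k\theta$ and common variance $\sigma^2\Delta$; the truncation level `kmax`, the grid parameters and all function names are our choices. It evaluates $I^{\theta\theta}_\Delta$ by quadrature, which can then be compared with the heuristic value $\mathcal{J}(\beta)/\sigma^2\Delta^{2/\beta-1}$, equal to $1/\sigma^2$ here if $\mathcal{J}(2) = 1$.

```python
import math


def density_and_theta_derivative(x, sigma, theta, delta, kmax=12):
    """Return (p, dp/dtheta) for X = sigma * W_delta + theta * N_delta (W Brownian, N Poisson)."""
    var = sigma ** 2 * delta
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    p = 0.0
    dp = 0.0
    weight = math.exp(-delta)           # Poisson probability of k = 0 jumps
    for k in range(kmax + 1):
        d = x - k * theta
        g = weight * norm * math.exp(-0.5 * d * d / var)
        p += g
        dp += g * k * d / var           # theta-derivative of the k-th Gaussian component
        weight *= delta / (k + 1)       # move on to the probability of k + 1 jumps
    return p, dp


def fisher_theta_poisson(sigma, theta, delta, kmax=12, step_sd=0.02, pad_sd=10.0):
    """Trapezoid quadrature of (dp/dtheta)**2 / p over a grid covering all retained components."""
    sd = sigma * math.sqrt(delta)
    lo = min(0.0, kmax * theta) - pad_sd * sd
    hi = max(0.0, kmax * theta) + pad_sd * sd
    n = int(math.ceil((hi - lo) / (step_sd * sd)))
    step = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        p, dp = density_and_theta_derivative(lo + i * step, sigma, theta, delta, kmax)
        if p > 0.0:
            w = 0.5 if i in (0, n) else 1.0
            total += w * dp * dp / p
    return total * step


if __name__ == "__main__":
    print(fisher_theta_poisson(sigma=1.0, theta=1.0, delta=0.01))
```

When the Gaussian components are well separated, a back-of-the-envelope computation suggests a value close to $E(N_\Delta^2)/\sigma^2\Delta = (1+\Delta)/\sigma^2$, consistent with the limit $1/\sigma^2$ as $\Delta \to 0$.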



Omitting the mention of (cr, /?, 6, G), we also set 

^'i\k,xh'i\k',x) 



(73) i = 2,3 : rJ^(A:,A;') = j 



Pa{x) 

By Cauchy-Schwarz inequality, we have 

(74) i = 2,3: T^i\k,k') < ^T^^ ik,k)T^^ (k' ,k' 



dx, T^^{k,k') = j 



il \k,x)^t>{k',x) 
PAix) 



(3), 



dx 



r'^l\k,k') <JT'^l>{k,k)T'l>{k',k 



(2), 



For any > we have pa{x) > e ^'^fi^ 7a ^(^' Therefore 



(75) 



i = 2,3 : 



ri)(fc,^)< 



^A 



kl [lf{k,x) 



(AA)fc 7 ^l\k,x) 



dx. 



Finally, if we plug dTTJ) and (O into ((HSl), we get 

(76) 



ro-e _ „~2AA 

i A C 



fc=0 1=1 

00 00 



A;! l\ 



(77) 






b) Now we can proceed to the proof of Theorem 5. When Y is a standard Poisson process, we have
$\lambda = 1$ and $\mu_k = \varepsilon_k$. Therefore we get

(1) ^ 1 , ^ X — 9k\ 



(2)/, \ I y f X — 9k' 

(79) 7a (^'^) = ^w7^^/3^^AWy' 

Plugging this into ((7^ yields 

(81) < ^ T^SiKk) < J{fi). 



Recall that (29) follows from Theorem 2, so we need to prove (30) and (31). In view of (76) and
(77), this amounts to proving the following two properties:



fc=0 «=1 /c=l «=1 



/c=l «=1 

If we use H74|) and (jHlf) . it is easily seen that the sum of all summands in the first (resp. second) left 
side above, except the one for A; = and / = 1 (resp. k = I = 1) goes to 0. So we are left to prove 

(82) AV2+1//3 r(J)(o, 1) ^ 0, rg)(l, l)^^ J(/3). 



Let uj = 2(J+p)^ so (1 + i)(l - w) = 1 + 1/2/3. Assume first that /3 < 2. Then if |x| < |6'/ctA1/^|'^, 
we have for some constant C £ (0, oo), possibly depending on 6, a and P, and which changes from 
an occurrence to the other, and provided A < [26 /a)^: 

hp{x) > C7A^(i+i/'3) , hi3{x + re/aA^/f^) < CA^+^/>^ 

when r G Z\{0}, and thus 

\x\ < I^/ctA^/^I^ hp{x + re/aA^/l^) < Chp{x)A^^+'^/'^'^^^-^'^ = Chp{x)A^+^/^l^ . 

When /? = 2, a simple computation on the normal density shows that the above property also holds. 
Therefore in view of (|7()|) and (|78|) we deduce that in all cases. 



e-^ ( x-e \ 



j (l + CA^/2^ . 
By (ini) it follows that



> : / ^' ' dx 

We readily deduce that Mxaini^^o A^+'^'l^vf [1,1) > J{l3)/a'^. On the other hand, ^ yields 
limsupA^o^^^^^'^r^^(l,l) < J{P)/(J^, and thus the second part of ^ is proved. 

Finally hf^/hp is bounded, so ^ and ^ and the fact that pa{x) > e'^hpix j a IS^I j o IS^I <^ 
yield 



rk'^(o,i) 



e 

< 



A 



and the first part of H82j) readily follows. 

7.8. Proof of Theorem 6. We use the same notation as in the previous proof, but here the
measure $\mu_k$ has a density $f_k$ for all $k \ge 1$, which further is differentiable and satisfies (32) uniformly
in k, while we still have $\mu_0 = \varepsilon_0$. Exactly as for $\breve h_\beta$, we set

$\breve f_k(u) = u f'_k(u) + f_k(u).$

Recall the Fisher information $\mathcal{L}$ defined in (33), which corresponds to estimating $\theta$ in a model where
one observes a variable $\theta U$, with U having the law $\mu$. Now if we have n independent variables $U_i$ with
the same law $\mu$, the Fisher information associated with the observation of $\theta U_i$ for $i = 1, \ldots, n$ is of
course $n\mathcal{L}$, and if instead one observes only $\theta(U_1 + \ldots + U_n)$, one gets a smaller Fisher information
$\mathcal{L}_n \le n\mathcal{L}$. In other words, we have

(83) $\mathcal{L}_n := \int \frac{\breve f_n(u)^2}{f_n(u)}\,du \le n\mathcal{L}.$

Taking advantage of the fact that $\mu_k$ has a density, for all $k \ge 1$ we can rewrite $\gamma^{(i)}_\Delta(k, x)$ as follows
(using further an integration by parts when $i = 3$ and the fact that each $f_k$ satisfies (32)):
(84) 7^l\k,x) = \j hp{y)h ^^^'^ J dy 



(85) ll\k,x) = j^l hp{y)h l^^^^j dy 

(86) lt\k,x) = 1| hp{y)h l^^^y^^ dy. 
Since the /^'s satisfy uniformly in A;, we readily deduce that 

(87) fe>l, i = 1,2,3 ^ \^f{k,x)\<C, 1 ^g^^ ^^3)^^^^^^^ 

Let us start with the lower bound. Since 7^''(0, x) = -^^rjj h/^ (^^7j)' deduce from Q and 
(l87jl and ((ZOl) that 

/ ^ 1 , cr X /X 

as soon as x 7^ 0, and with the convention C2 = 0. In a similar way, we deduce from (|87j) and (|72j) : 
(89) l^^^'^^^)^-^^^©- 

Then plugging (jHH|l and (jHI^) into the last equation in (jHSl, we conclude by Fatou's Lemma and after 
a change of variable, that 

liminf — > ^ f , „, ^ '^^ '^^ ,^0, — rr^ c^a^- 



A^o A - 9^ J Xf{x)+Cf3 (Tf^/eP\x\^+l^ 



It remains to prove (|35() and the upper bound in H36j) (including when (3 = 2). By Cauchy-Schwarz, 
we get (using successively the two equivalent versions for ^^^{k,x)): 



l1\k,xf < j^^\k,x) J fikidn) hpi{x-eu)/aA^/P). 

'y^'Hk X? < 1 7«(A: X) fh,(v) M--y<rA^/')/Or , 
7a ik,x) < ^3 7^ [k,x)J hp[y) dy. 



Then it follows from H75jl and (|83jl that 



(90) ^Aik,k) < J(/3), r2)(fc,^) < £. 
We also need an estimate for r^^(0, 1). By (|67|) and (|68|) and (|70j) we obtain pa{x) > 



e-^^ hi3{x/aAyP)/aA^/f^ and 7X^(0 
nition ((7^ to get 

rg)(o,i) 



< 



at 



/1/3 VaAi//3 



hf3{x/aA^/>^)/a'^A^/^. Then use (jHH) and the defi- 

X - yaA^/l^^ 



f 



(91) 



/ 



dx < C, 



where the last inequality comes from the facts that hp/hj^ is bounded and that / is integrable (due 
to (El). 

At this stage we use (74), together with (76) and (77) and the fact that $2|xy| \le a x^2 + y^2/a$ for all
$a > 0$. Taking arbitrary constants $a_{kl} > 0$, we deduce from (90) that



(AA)^ A; ^ ^ (AA)'=+' ( k k\ III V^^^ 



k\ 



< 



00 00 



^ k=l l=k+l 



(xaY k 
u 



a-ki 



00 00 

+ EE 



(AA)'^ / 1 



k\ 



O-kl 



k=l l=k+l 

Then if we take aki = (AA)~ for I > k, a simple computation shows that indeed 

If' < ^ (aA + CA3/2) 

for some constant C, and thus we get the upper bound in 1)36(1 . In a similar way, and replacing L/B"^ 
above by the supremum between L/O"^ and /(/?)/(T^, we see that in (|75() the sum of the absolute 
values of all summands except the one for A; = and / = 1 is smaller than a constant times A A. 
Finally, the same holds for the term for k = and 1 = 1, because of ()91j) . and this proves (|.S5|) . 



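7.9. Proof of Theorem 7. In the setting of Theorem 7 the measure $G_\Delta$ admits the density $y \mapsto h_\alpha(y/\Delta^{1/\alpha})/\Delta^{1/\alpha}$. For simplicity we set

(92) $u = \Delta^{\frac{1}{\alpha} - \frac{1}{\beta}}$

(so $u \to 0$ as $\Delta \to 0$), and a change of variable allows to write (48) as

$p_\Delta(x|\sigma,\beta,\theta,G) = \frac{1}{\theta u \Delta^{1/\beta}} \int h_\beta\left(\frac{x}{\sigma\Delta^{1/\beta}} - y\right) h_\alpha\left(\frac{\sigma y}{\theta u}\right) dy.$

Therefore

$\partial_\theta p_\Delta(x|\sigma,\beta,\theta,G) = -\frac{1}{\theta^2 u \Delta^{1/\beta}} \int h_\beta\left(\frac{x}{\sigma\Delta^{1/\beta}} - y\right) \breve h_\alpha\left(\frac{\sigma y}{\theta u}\right) dy,$

and another change of variable in (53) leads to

(93) $I^{\theta\theta}_\Delta = \frac{\theta^{2\alpha-2}\,\Delta^{2(\beta-\alpha)/\beta}}{\sigma^{2\alpha}}\; J_u,$

where

(94) $J_u = \int \frac{R_u(x)^2}{S_u(x)}\,dx,$

$R_u(x) = \left(\frac{\sigma}{\theta u}\right)^{1+\alpha} \int h_\beta(x - y)\, \breve h_\alpha\left(\frac{\sigma y}{\theta u}\right) dy = \left(\frac{\sigma}{\theta u}\right)^{\alpha} \int h_\beta\left(x - \frac{\theta u y}{\sigma}\right) \breve h_\alpha(y)\,dy,$

$S_u(x) = \frac{\sigma}{\theta u} \int h_\beta(x - y)\, h_\alpha\left(\frac{\sigma y}{\theta u}\right) dy = \int h_\beta\left(x - \frac{\theta u y}{\sigma}\right) h_\alpha(y)\,dy.$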

Below, we denote by ii' a constant and by a continuous function on i?+ with 0(0) = 0, both of 
them changing from line to line and possibly depending on the parameters a, (3, cr, 9] if they depend 
on another parameter r] we write them as Kr^ or 0^. Recalling ©, we have 



(95) h^{x) 



ha{x) 



K{y)dy 
{\y\>\x\} a\x\ 



as X oo. 



Another application of Q when (3 <2 and of the explicit form of ^2 gives 

Khp{x) if /3<2 



(96) 



\y\ < 1 



hg{x 



< hp{x) 



K(l + x2)e-^'/2 if /3 = 2. 



In order to obtain estimates on and Su, we split the first two integrals defining these functions 
into sums of integrals on the two domains {\y\ < r/} and {\y\ > ry}, for some ry G (0, 1] to be chosen 
later. We have \hp{x — y) — hp{x) + h'p{x)y\ < /i^(x)|y2 as soon as \y\ < 1, so the fact that both 
f = ha and f = ha are even functions gives 

J{\y\<ri} ^^^^ 

On the one hand we have with f = ha or f = ha, and in view of H95|) : 

z'^\f{z)\ dz < Ku^+'^rj^-'' . 



L 



hpix - y)f {^) dy - hpy 
{\y\<v} ^^^^ 'l{\y\<v} 



< hp{x) I y- 
{\y\<v} 



■' \eu) 



dy. 



1 


f{-) 


dy = / 


l{\y\<v} 




^ J{\ 



On the other hand, the integrability of ha and ha and the fact that J ha{y)dy = yield 
J\\y\<l} 



{H<1} 



\Q idy = — / ha{z)dz = — (l + 0(u)), 

vt/n/ a J{\z\<a/eu} ^ 



{\y\<r)} 



ha f ^) dy = — 
1 \0u/ a 



{\z\<an/eu} 



ha{z)dz 



Ou 



(0n)^+" 

ha{z)dz = 2Ca ^ [l + ^r^iu)), 

^ ■'{\z\>ari/eu} a ^ T] 
h\y\<v} - y)ha m dy - 2ca h^{x)\ < ni+- [hp{x) ^ + Kh^{x)r^'-'^) . 
(97) <^ 



For the integrals on {\y\ > rj} we observe that by ()95() we have |/iQ(fTy/0n) — CQ(0u/cr|y|)^+" | < 
9u/ay)^^"(j){u), and the same for except that Cq is substituted with —aca- Then if 



-Dr,(2;) = / h(3{x - y) dy, 



(99) 



(98) 
we readily get 

h\y\>'^} ^f^^^ ~ ~ ^'^ -^i(^) < i:'i(x)ni+°(/>(u), 

/{|j;|>,,} /i/3(3; - y)/ia {^) dy + ac„ ^^"iC" Dr,{x) < D^{x)u^+°'(t){u). 
At this stage, if we put together (|^7)) and (I^U]). we obtain 

5„(x) - (/i^(a;) + c„ ^ Z)i(x)) I < {hp{x) + u"L>i(x)) (j){u) + Ku'^hp{x), 

Ru{x) - c„ - al),(x)) I < + Z)^(x)) (/>,(^.) + KhpixW 



(100) 



Our next step is to study the behavior at infinity of the continuous bounded and positive function 
Drj. We split the integral in H98|) into the sum of the integrals, say oi^^ and over the two domains 
{\y-x\ < and{|y| > 'q,\y-x\ > where 7 = 4/5 if /3 = 2 and 7 G ((l+a)/(l+/3), 1) if/3<2. 

On the one hand, Di^\x) < Khp{\xp), so with our choice of 7 we obviously have \x\^'^°' Di^\x) 0. 
On the other hand \x\^~^'^ Dij'\x) is clearly equivalent, as |x| — > 00, to f^\y-x\<\x\y} ^f^i^ ~y)dy^ which 
equals /||2|<|j.|7} hp{z)dz, which in turns goes to 1. Hence we get for all r/ > 0: 



(101) 



D^{x) 



\l+a 



as \x\ — > 00. 



At this stage, we can obtain the behavior of Ru and S'u as n ^ 0. First, an application of (|95() to 
the last formula in ()94() and Lebesgue theorem readily give 



(102) 



lim Su{x) = hp{x). 



Also, by (jMl), (fmni) and (tTITT]) . we get 



(103) 



Su{x) > { 



c 



C ( e-'/2 + 



if 13 < 2 
if 13 = 2 
for some $C > 0$ depending on the parameters, and all u small enough. In the same way, we see that
u^Ru{x) 0, but this not enough. However, if = — D^i we deduce from pOU|) that for all 
r/G (0,1], 

(104) limsup \Ru{x) - Car^n{x)\ < Khp{x)rf~''^. 

A simple computation and the second order Taylor expansion with integral remainder for /i^ yield 

r,{x) =a f W-h^ij^-y) dy = -a[ \y\'-»dy [\l - v)h"p{x - yv)dv. 
J{\y\>v} \y\ J{\y\>v} Jo 

By Lebesgue theorem, converges pointwise as 77 ^ to the function r given by 



r{x) = —a / \y\^^'^dy I {1 — v)h"n{x — yv)dv . 
Jn Jo 



Then, taking into account H104|) . and using once more (jlOOf) together with (|^^ and ()101() and 
also a < P, we get 

(105) lim R4x) = Car (x), \Ru{x)\ < ^ 



u^O 1 + 



We are now in a position to prove so /? < 2 and a > (3/2 (and of course a < 13). Since a > (3/2, 
we see that (|102() , 1)103^ and (jlOSf) allow to apply Lebesgue theorem in the definition (^1]) to get that 

f ^^T^^ dx: this is ^ (obviously \r{x)\ < K/{1 + |a;|i+°), while hfsix) > C/{1 + 
for some C > 0, so the integral in H39|) converges). 



The other cases are a bit more involved, because Lebesgue theorem does not apply and we will 
see that Ju goes to infinity. First, we introduce the following functions: 



R'{x) 



aCn 



\x 



1+a ■ 



+ 



if /3 < 2 
a (3 = 2 



/2tt ' cr'»|x|i+" 

Below we denote by i(){u,T) for u G (0, 1] and L > 1 the sum </)'(n) + </>"(l/r) for any two functions 
0' and like above (changing from line to line). We deduce from (|95|) . (|96|) . (|1()()|) for r] = 1, and 
(dnO, that 



(106) 



Ixi > r 



\Ru{x) + R' {x)\ <i(,{u,T)R'{x) + < 



K 



if /3<2 
In a similar way, we get 



\x\ > r 



\Su{x)-SUx)\ < { 



V'(n,r)5;(x) if/?<2 

ip{u,r)Si,{x) + Kou'^x^e-'^^/^ if /3 = 2, 

where Kq is some constant. Then for any (p > we denote by the smahest number bigger than 
1, such that K()x'^e~^^^'^ < tp ^^^^i+a for ah > F^. The last estimate above for (3 = 2 reads 
\Su — S'^\ < S'^{ip{u,T) + if), so in all cases we have for some fixed function TpQ as above: 



as 



(107) Suix) = S'^{x){l+pu{x)), where \puix)\ < < 



if P <2, \x\ > r 
Mu,T) + ip if /? = 2, |x| > r > To. 



At this stage, we set 



JuT 



R'{xf 
{\x\>v} S'u{x) 



dx. 



Observe that Ju = Ju,r + J2i=i '^uF^ where 

Ruixf ^ <2) f {Ru{x) + R'{x))'' 



(1) ^ r RujxY ^(2) ^ r 

J{\x\<r} Su{x) ' J^\^ 



- -2 



R'{x){Ru{x) + R'{x)) 



dx, J 



(4) 



{|x|>r} Suix 
R'ixf 



dx, 



{\x\<T} Su{x) J{\x\>r} Su{x) 

^From ()1U3() . ()1U5() and (|1U6() we get for some uq > 0: 



dx 



R'jxf 
{\x\>r} S[,{x) 



dx. 



sup 4'^<oo, ueiO,uo] /^l<K + i;{u,r)(jl% + J^A. 

«6(0,«o] ' ' V • / 



1/2 



Cauchy-Schwarz inequality yields 

Finally, (fTUH]) and (fTU7|) and the definition of R' yield (with Vo as in (fTUT)) ): 

2^o(u,F)J„,r if P<2, Vo(u,F) < 1/2 
2(Vo(^^,F) + (^)J„,rif P = 2, F > F^, Mu,r) + ^ < 1/2 



\j!:i\<{ 



Therefore we get 



1 



< 



K 



Ju' 



+ V(n,F) + 2(^o(u,r) + (^), 



as soon as $\psi_0(u,\Gamma) + \varphi \le 1/2$ and $\Gamma \ge \Gamma_\varphi$, and with the convention that $\varphi = 0$ and $\Gamma_0 = 1$ when
$\beta < 2$. Then, remembering that $\lim_{u \to 0} \lim_{\Gamma \to \infty} \psi(u, \Gamma) = 0$, and the same for $\psi_0$, and that $\varphi = 0$
when $\beta < 2$ and $\varphi$ is arbitrarily small when $\beta = 2$, we readily deduce the following fact: Suppose
that for some function $u \mapsto \gamma(u)$ going to $+\infty$ as $u \to 0$, and independent of $\Gamma$, we have proved that

(108) $J_{u,\Gamma} \sim \gamma(u)$ as $u \to 0$, $\forall\, \Gamma > 1$;

then $J_u \sim \gamma(u)$, and therefore by (93) we get

(109) $I^{\theta\theta}_\Delta \sim \frac{\theta^{2\alpha-2}\,\Delta^{2(\beta-\alpha)/\beta}}{\sigma^{2\alpha}}\;\gamma(u).$



The simplest case is when f3 < 2 and a < (3/2. Indeed the change of variables z = xu°'/^^ yields 

a^c^ r 1 

(HQ) = « / Z dz 

J{\z\>ru"/(0-")} Cl3\Z\ + Cat! \Z\ jO 

and we have (|in8j) with 



C/3|z|l+2"-/5 + Ca0"\z\'^+"/cr- 

So if we combine this with 1)109^ . we get (|41() . 



Suppose now that 2a = /? < 2. Then 

JuF = 2q^c?, / — ; — - dx. 

" Jr x(c/3 + Cc,6i"n°x°/o-") 

For t> > we let be the unique point x > such that Ca0'^v°'x°' /a" = cp, so in fact = p/v for 
some p > 0. We have 

Ju,r<^ -dx + ^ -4^dx<f^log{lM + K. 

On the other hand, if /i > and T < x < i/u^ we have cp + CqQ'^u'^x'^ j 0°^ < c^(l + l//u"), hence 

C/3 7r 2;(1 + l//i") Cp 1 + l//i" 

Putting together these two estimates and choosing fi big give fllOSIl with 7(u) = ^" log(l/u), and 
we readily deduce (|ifl|) . 

The case /3 = 2 is treated in pretty much the same way. We have 

JuF = 2a^c?, / 5— — r= dx. 
We suppose that $\Gamma$ is big enough for $x \mapsto x^{1+\alpha} e^{-x^2/2}$ to be decreasing on $[\Gamma, \infty)$. For $v > 0$ small
enough, there is a unique number $H_v = x > \Gamma$ such that $c_\alpha \theta^\alpha v^\alpha/\sigma^\alpha = x^{1+\alpha} e^{-x^2/2}/\sqrt{2\pi}$, so in fact
$H_v \sim \sqrt{2\alpha \log(1/v)}$ when $v \to 0$. Then

- - ^"n"(2a)°/2 (log(l/n))"/2 ^ 

On the other hand, if ^ > 1 and x > H^^, we have xi+"e-^''/2/\/2^ + c„6'"u"/cj° < +c„6'"u"(l + 
//")/cr", hence 

2a2c,a^ /-g^^ 1 ^ 2ac^c7" n + 

"•^ 6'"u" ir + ^ - 6'"M°(2a)°/2 (log(l/n))°/2 + ^ ^""^ 

So again we see, by choosing ^ close to 1, that the desired result holds with 

, , 2q;Cq,(J° 
^^'^^ " ^"n"(2a)°/2 (log(l/n))"/2' 

and we deduce 